Modeling Non-Compositional Expressions using a Search Engine

EasyChair Preprint 418
6 pages • Date: August 9, 2018

Abstract

Non-compositional multi-word expressions present great challenges to natural language processing applications. In this paper, we present a method for modeling non-compositional expressions based on the assumption that the meaning of an expression depends on its context. Context words can therefore be used to select documents and to separate documents in which the expression has different meanings. Deviation from a baseline is measured using serendipity (i.e., the pointwise effect size). We used this statistical measure to mark which patterns are over- and under-represented and to decide whether the pattern under scrutiny belongs to the meaning selected by the context words. We used the Google search engine to obtain document frequency estimates. When used with these estimates, the serendipity measure closely mirrors some human intuitions about the preferred alternative.

Keyphrases: Context Word, Frequency Machine, Natural Language Processing, Non-compositional, Serendipity, compositional meaning, compositional multi word expression, computational linguistic, conjunction fallacy, effect size, expected frequency, memory-based learning, multiword expressions, non compositional expression, non-compositional meaning, search engine, statistics
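
The abstract does not state the formula for serendipity, so the following is only a rough sketch of the general idea of measuring over- and under-representation against a baseline. It assumes an independence baseline for the expected document frequency and uses a Pearson-style standardized residual as a stand-in for the deviation measure; this is not necessarily the paper's definition. All function names, the threshold of 2.0, and the counts in the usage note are hypothetical.

```python
import math


def expected_count(context_df: int, pattern_df: int, total_docs: int) -> float:
    """Expected number of documents containing both the context words and
    the pattern, assuming the two occur independently (an assumed baseline)."""
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    return context_df * pattern_df / total_docs


def standardized_deviation(observed: int, expected: float) -> float:
    """Pearson-style residual: signed deviation of the observed count from
    the expected count, scaled by the square root of the expected count.
    Used here as a stand-in for the paper's serendipity measure."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (observed - expected) / math.sqrt(expected)


def representation(observed: int, context_df: int, pattern_df: int,
                   total_docs: int, threshold: float = 2.0) -> str:
    """Label a pattern as over-, under-, or neutrally represented in the
    documents selected by the context words (threshold is hypothetical)."""
    expected = expected_count(context_df, pattern_df, total_docs)
    score = standardized_deviation(observed, expected)
    if score > threshold:
        return "over"
    if score < -threshold:
        return "under"
    return "neutral"
```

With invented counts, `representation(20, 1000, 500, 100000)` compares 20 observed co-occurrences against an expected count of 5 and labels the pattern as over-represented. In practice the counts would come from search engine document frequency estimates, which are approximate.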