Christiane FELLBAUM & Alexander GEYKEN (Princeton, USA / Berlin, Germany)
Transforming a Corpus into a Lexical Resource. The Berlin Idiom Project
2005, Vol. X-2, pp. 49-62
We discuss the goals and methods of the lexicographic project "Collocations and Idioms in German" at the Berlin-Brandenburg Academy of Sciences. A very large corpus is tagged and parsed to enable flexible searches for target structures, specifically German verb phrase idioms. On the basis of the relevant tokens, an extensive linguistic-lexicographic analysis is performed and recorded on a set of structured forms, which together constitute a kind of digital dictionary entry for the target structure. For transparency and to support future research, each recorded linguistic-lexicographic phenomenon is linked to appropriate corpus tokens. The resulting resource, which combines an exhaustive description of the idioms' properties with corpus tokens, allows for multiple search types.
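
The linking of recorded phenomena to corpus tokens can be pictured with a minimal sketch. This is not the project's actual data model: the class names, field names, phenomenon labels, and helper functions below are all hypothetical, chosen only to illustrate how an entry that stores token identifiers per phenomenon could support more than one kind of search.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class CorpusToken:
    """One corpus occurrence of an idiom (hypothetical fields)."""
    token_id: str
    sentence: str


@dataclass
class IdiomEntry:
    """A hypothetical digital dictionary entry for one verb phrase idiom."""
    citation_form: str
    # Maps a recorded phenomenon label to the ids of the tokens that attest it.
    evidence: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, phenomenon: str, token: CorpusToken) -> None:
        """Record a phenomenon and link it to a supporting corpus token."""
        self.evidence.setdefault(phenomenon, []).append(token.token_id)


def entries_with(entries: List[IdiomEntry], phenomenon: str) -> List[str]:
    """Search type 1: which idioms show a given phenomenon?"""
    return [e.citation_form for e in entries if phenomenon in e.evidence]


def tokens_for(entry: IdiomEntry, phenomenon: str,
               corpus: Dict[str, CorpusToken]) -> List[CorpusToken]:
    """Search type 2: which corpus tokens attest a phenomenon for an idiom?"""
    return [corpus[tid] for tid in entry.evidence.get(phenomenon, [])]
```

In such a layout, an entry for an idiom like "jemandem einen Bären aufbinden" could record a label such as "passivization" against the identifier of a corpus sentence, so that a reader can move from the description back to the attesting token, or from a phenomenon to every idiom that exhibits it.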