DEREKO
DEREKO is a joint effort of IDS Mannheim (Acquisition), SfS Tübingen (Annotation), and IMS Stuttgart (Exploitation).










Corpus Exploitation

Query Collection

The query collection is organized in two layers:
  • The first layer of queries builds recursive syntactic structures on top of the chunk annotation (see the Shallow-Parsing Stylebook for German for more details).
  • The second layer of queries extracts corpus evidence for specific lexicographic tasks.
The results of the first-layer queries are added to the corpus annotation for efficiency, so that the basic syntactic analyses do not have to be reconstructed for each extraction query. The queries identify, among others, the following syntactic structures:
  • pre-head recursive embedding, e.g., complex APs that embed PPs and NPs
  • post-head recursive embedding of genitive NPs and named entities
Neither construction is part of the chunk analysis.
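
As a rough illustration of how a first-layer step might build post-head embedding on top of flat chunks, the following Python sketch attaches genitive noun chunks to the noun chunk on their left. It is not the project's query code: the Chunk class, the labels "NC" and "VC", the case value "gen", and the function names are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    label: str                 # chunk category; "NC" (noun chunk) is a hypothetical label
    tokens: list               # surface tokens of the chunk
    case: str = ""             # case of the head, e.g. "gen" (assumed to be annotated)
    children: list = field(default_factory=list)  # phrases embedded after the head


def rightmost(chunk):
    """Follow the last embedded child down to the deepest rightmost chunk."""
    while chunk.children:
        chunk = chunk.children[-1]
    return chunk


def attach_genitives(chunks):
    """Embed each genitive noun chunk under the noun chunk to its left.

    Attaching to the rightmost embedded chunk yields nested (recursive)
    structures. The input chunks are modified in place.
    """
    result = []
    for chunk in chunks:
        is_genitive_nc = chunk.label == "NC" and chunk.case == "gen"
        if is_genitive_nc and result and result[-1].label == "NC":
            rightmost(result[-1]).children.append(chunk)
        else:
            result.append(chunk)
    return result


flat = [
    Chunk("NC", ["das", "Haus"], "nom"),
    Chunk("NC", ["des", "Vaters"], "gen"),
    Chunk("NC", ["der", "Braut"], "gen"),
    Chunk("VC", ["brennt"]),
]
nested = attach_genitives(flat)
```

Here the three flat noun chunks of "das Haus des Vaters der Braut" end up as one nested structure. Storing this result once is the kind of precomputation that spares the extraction queries from rebuilding it.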

In addition, certain lexical properties of terminal nodes are added to the annotation. These properties are projected from the head of a chunk to the chunk itself. During extraction, this lexical information is used either to select chunks with specific annotations or to exclude them.
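
The projection and its use during extraction can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the dictionary layout, the "head" index, the "props" key, and the property name "temporal" are hypothetical and do not reflect the actual annotation format.

```python
def project_head_properties(chunk):
    """Copy the lexical properties of the head token onto the chunk itself."""
    head = chunk["tokens"][chunk["head"]]
    chunk["props"] = set(head.get("props", ()))
    return chunk


def select_chunks(chunks, require=(), exclude=()):
    """Keep chunks that carry every required property and no excluded one."""
    required, excluded = set(require), set(exclude)
    return [
        c for c in chunks
        if required <= c["props"] and not (excluded & c["props"])
    ]


# Hypothetical chunks: "head" is the index of the head token within "tokens".
chunks = [
    {"label": "NC", "head": 1,
     "tokens": [{"form": "am"}, {"form": "Montag", "props": ["temporal"]}]},
    {"label": "NC", "head": 1,
     "tokens": [{"form": "das"}, {"form": "Haus"}]},
]
chunks = [project_head_properties(c) for c in chunks]

temporal = select_chunks(chunks, require=["temporal"])
non_temporal = select_chunks(chunks, exclude=["temporal"])
```

Because the property sits on the chunk after projection, a query can test the chunk directly instead of descending to its head token each time.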

The second layer of queries makes use of both the chunk annotation and the information added by the first-layer queries in order to extract corpus evidence for lexicographic and linguistic purposes.





Please contact kcl@ims.uni-stuttgart.de for more information. Site last modified Sun Sep 26 2004.