DEREKO is a joint effort of
  • IDS Mannheim (Acquisition)
  • SfS Tübingen (Annotation)
  • IMS Stuttgart (Exploitation)

Linguistic Annotation


Chapter 3 of the final report of the DEREKO project (ps.gz pdf) gives a concise outline of the DEREKO linguistic annotation.

Details on the kinds of information annotated in DEREKO, and on how this information is represented as an extension to the XCES encoding standard, are given in the following two manuals:

  • Frank Henrik Müller (2002). Shallow-Parsing Stylebook for German. Technical Report. Seminar für Sprachwissenschaft, Universität Tübingen. (ps.gz pdf)

    The Shallow-Parsing Stylebook for German explains the syntactic annotation at the chunk, topological field, and clause levels. It focuses on the strategies adopted to annotate unrestricted German text robustly and reliably, and it includes a discussion of phenomena that are outside the scope of the DEREKO annotation.

  • Tylman Ule (2002). DEREKO Linguistic Markup. Technical Report. Seminar für Sprachwissenschaft, Universität Tübingen. (ps.gz pdf)

    The DEREKO Linguistic Markup manual explains all other aspects of the annotation, including POS markup and tokenisation. It also shows which XML elements and attributes encode which phenomena for all DEREKO annotation within a sentence, including shallow parsing.

Please find some more related publications below:
  • Tylman Ule and Frank H. Müller (2004): KaRoPars: Ein System zur linguistischen Annotation großer Text-Korpora des Deutschen. In A. Mehler and H. Lobin (Eds.): Automatische Textanalyse. Systeme und Methoden zur Annotation und Analyse natürlichsprachlicher Texte. Opladen: Westdeutscher Verlag.
  • Erhard W. Hinrichs, Sandra Kübler, Frank H. Müller and Tylman Ule (2002): A Hybrid Architecture for Robust Parsing of German. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), Las Palmas, Gran Canaria, May 2002. (ps pdf)
  • Frank H. Müller and Tylman Ule (2002): Annotating topological fields and chunks -- and revising POS tags at the same time. In Proceedings of the Nineteenth International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan, August 2002. (ps pdf)
  • Jorn Veenstra, Frank H. Müller and Tylman Ule (2002): Topological Fields Chunking for German. In Proceedings of the Sixth Conference on Natural Language Learning (CoNLL 2002), Taipei, Taiwan, September 2002. (ps pdf)
  • Frank H. Müller and Tylman Ule (2001): Satzklammer annotieren und Tags korrigieren: Ein mehrstufiges Top-Down-Bottom-Up-System zur flachen, robusten Annotierung von Sätzen im Deutschen. In Proceedings der GLDV-Frühjahrstagung 2001, Gießen, March 2001, 235-244. (pdf)

Please contact Tylman Ule for more information. Site last modified Sun Sep 26 2004.