Linguistic Annotation

Documentation

The final report of the DEREKO project (ps.gz pdf)
gives a concise outline of the DEREKO linguistic annotation in chapter 3.
Details on the kind of information annotated in DEREKO, and on how this
information is presented as an extension to the XCES encoding standard, are
given in the following two manuals:
- Frank Henrik Müller (2002). Shallow-Parsing Stylebook for German.
Technical Report. Seminar für Sprachwissenschaft, Universität Tübingen. (ps.gz pdf)
The Shallow Parsing Stylebook for German explains the syntactic
annotation at chunk, topological field, and clause level. It focuses on
explaining the strategies adopted to annotate unrestricted German language
robustly and reliably. It includes a discussion of phenomena outside the scope of
the DEREKO annotation.
- Tylman Ule (2002). DEREKO Linguistic Markup. Technical
Report. Seminar für Sprachwissenschaft, Universität Tübingen. (ps.gz pdf)
The DEREKO Linguistic Markup manual explains all other aspects
of annotation, including POS markup and tokenisation. The manual also shows
which XML elements and attributes encode which phenomena for all DEREKO
annotation within a sentence, including shallow parsing.
Further related publications:
- Tylman Ule and Frank H. Müller (2004): KaRoPars: Ein System zur
linguistischen Annotation großer Text-Korpora des Deutschen. In A. Mehler
and H. Lobin (Eds.): Automatische Textanalyse. Systeme und Methoden zur
Annotation und Analyse natürlichsprachlicher Texte. Opladen: Westdeutscher
Verlag.
- Erhard W. Hinrichs, Sandra Kübler, Frank H. Müller and Tylman Ule (2002):
A Hybrid Architecture for Robust Parsing of German. In Proceedings of
the Third International Conference on Language Resources and Evaluation (LREC
2002), Las Palmas, Gran Canaria, May 2002. (ps pdf)
- Frank H. Müller and Tylman Ule (2002): Annotating topological fields
and chunks -- and revising POS tags at the same time. In Proceedings of the
Nineteenth International Conference on Computational Linguistics (COLING 2002),
Taipei, Taiwan, August 2002. (ps pdf)
- Jorn Veenstra, Frank H. Müller and Tylman Ule (2002): Topological
Fields Chunking for German. In Proceedings of the Sixth Conference on
Natural Language Learning (CoNLL 2002), Taipei, Taiwan, September 2002. (ps pdf)
- Frank H. Müller and Tylman Ule (2001): Satzklammer annotieren und Tags
korrigieren: Ein mehrstufiges Top-Down-Bottom-Up-System zur flachen, robusten
Annotierung von Sätzen im Deutschen. In Proceedings der
GLDV-Frühjahrstagung 2001, Gießen, March 2001, 235-244. (pdf)