TreeTagger - A part-of-speech tagger
The TreeTagger is a tool for annotating text with part-of-speech
and lemma information. It was developed by Helmut Schmid in the TC
project at the Institute for Computational Linguistics of the
University of Stuttgart. The TreeTagger has been successfully used to
tag German, English, French, Italian, Dutch, Spanish, Bulgarian,
Russian, Portuguese, Galician, Chinese, Swahili, Slovak, Slovenian,
Latin, Estonian, Polish, Romanian, Coptic and Old French
texts and is adaptable to other languages if a lexicon and a manually
tagged training corpus are available.
The TreeTagger can also be used as a chunker for English, German,
French, and Spanish. More information: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
Sample output:
word | pos | lemma |
---|---|---|
The | DT | the |
TreeTagger | NP | TreeTagger |
is | VBZ | be |
easy | JJ | easy |
to | TO | to |
use | VB | use |
. | SENT | . |
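
Output like the sample above can be loaded into a program with a few lines of Python. The following is a minimal sketch, assuming the tagger writes one token per line with the word, tag, and lemma separated by tabs; the names `TaggedToken` and `parse_tagged_lines` are hypothetical helpers and not part of the TreeTagger distribution.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    """One line of tagger output (hypothetical container, not a TreeTagger type)."""
    word: str
    pos: str
    lemma: str


def parse_tagged_lines(text: str) -> List[TaggedToken]:
    """Parse tab-separated word/pos/lemma lines into TaggedToken records.

    Assumption: each token line has exactly three tab-separated fields.
    Blank lines and lines with a different field count are skipped.
    """
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        tokens.append(TaggedToken(*parts))
    return tokens


# The sample sentence from the table above, written as tab-separated lines.
sample = (
    "The\tDT\tthe\n"
    "TreeTagger\tNP\tTreeTagger\n"
    "is\tVBZ\tbe\n"
    "easy\tJJ\teasy\n"
    "to\tTO\tto\n"
    "use\tVB\tuse\n"
    ".\tSENT\t.\n"
)

tokens = parse_tagged_lines(sample)
lemmas = [t.lemma for t in tokens]
print(lemmas)
```

Keeping each token as a named record makes it straightforward to filter by tag afterwards, for example collecting the lemmas of all tokens whose `pos` starts with `VB`.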