TreeTagger - A part-of-speech tagger

The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Dutch, Spanish, Bulgarian, Russian, Portuguese, Galician, Chinese, Swahili, Slovak, Slovenian, Latin, Estonian, Polish, Romanian, Coptic, and Old French texts. It can be adapted to other languages if a lexicon and a manually tagged training corpus are available.
Sample output:

word  pos  lemma 
The  DT  the 
TreeTagger  NP  TreeTagger 
is  VBZ  be 
easy  JJ  easy 
to  TO  to 
use  VB  use 
.  SENT  .
The TreeTagger can also be used as a chunker for English, German, French, and Spanish. More: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
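
For readers who want to post-process tagger output like the sample above, here is a minimal Python sketch. It assumes the output has one token per line with three tab-separated columns (word, part-of-speech tag, lemma); the `Token` type and the `parse_tagged` function are hypothetical names made up for this illustration and are not part of the TreeTagger distribution.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One line of tagger output (hypothetical helper type)."""
    word: str
    pos: str
    lemma: str


def parse_tagged(text: str) -> List[Token]:
    """Parse tab-separated 'word<TAB>pos<TAB>lemma' lines into Token tuples.

    Assumption: exactly three tab-separated columns per token line.
    Blank lines and lines with a different column count are skipped.
    """
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # ignore lines that do not match the assumed layout
        tokens.append(Token(*parts))
    return tokens


# The first six rows of the sample output, written with explicit tabs.
sample = (
    "The\tDT\tthe\n"
    "TreeTagger\tNP\tTreeTagger\n"
    "is\tVBZ\tbe\n"
    "easy\tJJ\teasy\n"
    "to\tTO\tto\n"
    "use\tVB\tuse\n"
)

tokens = parse_tagged(sample)
lemmas = [t.lemma for t in tokens]          # lemma column only
verbs = [t.word for t in tokens if t.pos.startswith("VB")]  # VB* tags
```

Here `lemmas` holds the lemma column in order, and `verbs` collects the words whose tag starts with `VB`. If your output uses a different column layout, the split logic would need adjusting.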
