UIMA/U-Compare GENIA Tokeniser (GENIA Tagger)



Tokenisation is one function of the GENIA tagger, which also outputs base forms, part-of-speech tags, chunk tags, and named entity tags. The tagger is specifically tuned for biomedical text such as MEDLINE abstracts.
The tool is a UIMA component that forms part of the built-in library of components provided with the U-Compare platform (see separate META-SHARE record) for building and evaluating text mining workflows.
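
To illustrate the kinds of annotation the tagger produces per token, the following is a minimal sketch that reads tagger-style output into structured records. It assumes the tab-separated, five-column, one-token-per-line layout used by the standalone GENIA tagger (word, base form, part-of-speech tag, chunk tag, named entity tag, with a blank line between sentences). The names `Token`, `parse_tagger_output`, and `SAMPLE` are hypothetical and not part of the tool; the UIMA component itself delivers its results as annotations inside a workflow, not as this text format.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One token with the five fields described above (hypothetical type)."""
    word: str
    base: str
    pos: str
    chunk: str
    entity: str


def parse_tagger_output(text: str) -> List[List[Token]]:
    """Group tab-separated token lines into sentences.

    Assumes five columns per line and a blank line between sentences.
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}: {line!r}")
        current.append(Token(*fields))
    if current:
        sentences.append(current)
    return sentences


# Illustrative sample input in the assumed layout.
SAMPLE = (
    "Inhibition\tInhibition\tNN\tB-NP\tO\n"
    "of\tof\tIN\tB-PP\tO\n"
    "NF-kappaB\tNF-kappaB\tNN\tB-NP\tB-protein\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in parse_tagger_output(SAMPLE):
        for token in sentence:
            print(token.word, token.pos, token.chunk, token.entity)
```

Keeping each token as a named record makes it straightforward to use only the tokenisation (the `word` field) while ignoring the other tags.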

  • U-Compare Workbench
