OpenNLP is a collection of Java-based NLP tools for part-of-speech tagging, sentence detection, tokenization, chunking, parsing, named-entity detection, and coreference resolution.
All of these tasks are performed using the OpenNLP Maxent machine learning toolkit.
The goal of the OpenNLP project is to create a mature toolkit for the tasks listed above.
An additional goal is to provide a large number of pre-built models for a variety of languages, as well as the annotated text resources that those models are derived from.
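To give a rough idea of what a maximum entropy (maxent) model does when it is applied to a task such as sentence detection, here is a minimal sketch in plain Java. It does not use the OpenNLP or OpenNLP Maxent API: the class name `MaxentSketch`, the feature names, the outcome labels, and the weights are all hypothetical and chosen only for illustration. The sketch sums the weights of the active features for each outcome, exponentiates the sums, and normalizes them into probabilities.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MaxentSketch {

    // Hypothetical weights: outcome -> (feature -> weight).
    // A real model learns these from annotated training data.
    static final Map<String, Map<String, Double>> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("BOUNDARY", Map.of("nextIsUpper", 1.5, "prevIsAbbrev", -2.0));
        WEIGHTS.put("NOT_BOUNDARY", Map.of("nextIsUpper", -0.5, "prevIsAbbrev", 1.0));
    }

    // Returns p(outcome | features) = exp(sum of weights) / normalizer.
    static Map<String, Double> probabilities(List<String> activeFeatures) {
        Map<String, Double> scores = new LinkedHashMap<>();
        double total = 0.0;
        for (Map.Entry<String, Map<String, Double>> outcome : WEIGHTS.entrySet()) {
            double sum = 0.0;
            for (String feature : activeFeatures) {
                // Features unknown to the model contribute nothing.
                sum += outcome.getValue().getOrDefault(feature, 0.0);
            }
            double score = Math.exp(sum);
            scores.put(outcome.getKey(), score);
            total += score;
        }
        // Normalize so the probabilities sum to 1.
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            entry.setValue(entry.getValue() / total);
        }
        return scores;
    }

    // Picks the outcome with the highest probability.
    static String bestOutcome(List<String> activeFeatures) {
        String best = null;
        double bestP = -1.0;
        for (Map.Entry<String, Double> entry : probabilities(activeFeatures).entrySet()) {
            if (entry.getValue() > bestP) {
                bestP = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // A period followed by an upper-case word, not preceded by an abbreviation.
        System.out.println(probabilities(List.of("nextIsUpper")));
        System.out.println(bestOutcome(List.of("nextIsUpper", "prevIsAbbrev")));
    }
}
```

In the actual toolkit, the features are generated from the text surrounding each candidate position and the weights come from a trained model file, so this sketch only illustrates the scoring step.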
What's New in This Release:
· Improved the white space handling in the Sentence Detector and its training code
· Added more cross validator command line tools
· Command line handling code has been refactored
· Fixed problems with the new build
· Now uses fast token class feature generation code by default
· Added support for BioNLP/NLPBA 2004 shared task data
· Removed old and deprecated code
· Dictionary case sensitivity support is now done properly
· Support for OSGi