Tuesday, April 22, 2008

OpenNLP tools

I used the OpenNLP tools for the following:
1) Tokenizing
2) Sentence Detection
3) Shallow parsing
4) Named Entity recognition

Although I didn't find it very accurate for the Named Entity recognition task, it did a pretty good job on Tokenizing, Sentence Detection and Shallow parsing.

However, the model files don't seem to work on Windows Vista: the tools are unable to uncompress or read them. On Linux, everything works perfectly.
Since I have not modified them, using the tools from the command line or calling them from my program serves my purpose.
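
For anyone wondering what "calling from my program" can look like, here is a minimal sketch in Python. It only shows the general pattern of piping text into a command-line tool and reading back what it prints. The helper name run_tool and the STAND_IN command are made up for illustration; the stand-in simply echoes its input, and the real OpenNLP command line (class names, model paths) is not shown here and would have to be filled in from the toolkit's own documentation.

```python
import subprocess
import sys


def run_tool(command, text):
    """Send `text` to an external command's stdin and return its stdout.

    `command` is a list such as ["java", ...]; the actual arguments for
    OpenNLP are an assumption left to the reader and are not shown here.
    """
    result = subprocess.run(
        command,
        input=text,            # written to the tool's standard input
        capture_output=True,   # collect stdout and stderr
        text=True,             # work with str instead of bytes
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# Hypothetical stand-in for the real tool: it echoes stdin back unchanged,
# so the sketch runs without OpenNLP installed.
STAND_IN = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

if __name__ == "__main__":
    output = run_tool(STAND_IN, "This is a sentence. Here is another one.\n")
    print(output, end="")
```

Swapping STAND_IN for the real tool's command list is the only change needed; the rest of the program just works with the returned string.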

http://opennlp.sourceforge.net/

4 comments:

Tanya said...

Hi
I came across your post on NLP tools. Actually, I am also looking for open source software that can handle spelling mistakes, tokenization, stemming, etc. I have tried many packages, but none of them met my requirements. I am not good at either Linux or Python, so I am a little hesitant to try out the NLP toolkit, and I am not sure it is worth spending time on that software.
Any suggestions would be appreciated.
Thanks.

Nisha said...

Hi Tanya, I am sorry I didn't see your comment earlier. The OpenNLP toolkit is a Java-based toolkit. I just use the command line, so it works fine. Since I use Linux, I haven't tried too many Windows tools.

Anonymous said...

Maybe this is an option:
http://nltk.sourceforge.net/index.php/Main_Page

Nisha said...

baijum81 - I use NLTK and Python extensively in my work and research, and unfortunately it does not meet my needs completely.