Classification Tools
- Bow: A Toolkit for
Statistical Language Modeling, Text Retrieval, Classification and Clustering
- Bow home page.
- Rainbow is
an executable program that does document classification. While mostly
designed for classification by naive Bayes, it also provides TFIDF/Rocchio,
Probabilistic Indexing and K-nearest neighbor.
- Arrow is an
executable program that does document retrieval. It currently only
performs simple TFIDF-based retrieval.
- Crossbow
is an executable program that does document clustering (and also
classification).
- MALLET is an integrated collection of Java code useful for statistical
natural language processing, document classification, clustering,
information extraction, and other machine learning applications to text.
- A nice
intro (http://courses.washington.edu/ling572/winter07/homework/mallet_guide.pdf)
to Mallet written by
Fei Xia (http://faculty.washington.edu/fxia/)
at the University of Washington for an NLP course.
- Software similar to MALLET for machine learning applied to text
- Developer's corner
- tools for text
- Shogun: A Large Scale Machine Learning Toolbox (http://www.shogun-toolbox.org/)
- Machine Learning Open Source Software (http://mloss.org/software/)
Matlab Tools