Snowball stemmers release
Just made a new release of the Snowball stemmers Weka package available, version 1.0.1, which is just a minor release: it now depends on Weka 3.7.12, fixes the Maven integration, and adds unit tests for all stemmers.
Open-Source and related Stuff
Despite WEKA offering attribute and instance weights, you could only set them programmatically or by manually fiddling with ARFF/XRFF/JSON files. This Weka list post prompted me today to quickly hack together a WEKA package with filters that allow setting the weights…
Just released a new version of my new Weka package for natural language processing (NLP): https://github.com/fracpete/nlp-weka-package Changes: added an example parser model (wekafiles/packages/nlp/models/englishPCFG.ser.gz); added an Explorer tab for experimenting with parser setups and visualizing the parse trees. Here is a screenshot of…
Something that ADAMS’ Preview Browser has had for years, I’ve now added to Weka as a standalone tab in the Explorer: displaying the content of serialized model files. It allows the user to load a serialized model file (or actually…
Just released the first version of my new Weka package for natural language processing (NLP): https://github.com/fracpete/nlp-weka-package At the moment, it contains only some filters (ChangeCase, PartOfSpeechTagging) and tokenizers (WhiteSpaceTokenizer, PTBTokenizer). It uses the Stanford parser for the NLP heavy lifting.
Petr came across a bug that affected the output of the predictions generated from the test set. It worked fine for cross-validation, but not for the Random split and Unlabeled/Test set modes. I’ve committed a fix and made a new…
Just pushed out a new release of the collective-classification Weka package, incorporating feedback from the Weka mailing list. Changes: the Explorer panel now offers loading of an unlabeled/test set in Unlabeled/test set mode; classifiers now create copies of train/test sets in build…
A long time ago, I added a Weka meta-classifier for parameter optimization called MultiSearch. In contrast to GridSearch, which forces you to optimize two parameters (hence grid), this scheme allows you to optimize an arbitrary number of parameters. However, it…
Just released a new version of the confusionmatrix Weka package that I started a while ago. I added a new heatmap visualization, which scales the rows by the sum of counts in that row. Essentially using percentages in a row,…
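To illustrate the row scaling mentioned for the heatmap, here is a minimal sketch in plain Python. It is not the package's code: the function name `row_percentages` and the sample counts are made up for illustration, and the handling of empty rows is an assumption.

```python
def row_percentages(matrix):
    """Scale each row of a confusion matrix by the sum of counts in that row."""
    scaled = []
    for row in matrix:
        total = sum(row)
        # Assumption: a row without any instances stays at zero
        # instead of raising a division-by-zero error.
        scaled.append([count / total if total else 0.0 for count in row])
    return scaled


# Hypothetical 2x2 confusion matrix (rows: actual class, columns: predicted class).
counts = [
    [8, 2],
    [1, 9],
]
percentages = row_percentages(counts)
for row in percentages:
    print(row)
```

Scaling per row means each cell shows the fraction of that actual class that ended up in each predicted class, so classes with few instances remain visible next to large ones.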
Just made a new maintenance release available for the collective-classification project: it now works with Weka 3.7.11. You can download the Weka package from here: https://drive.google.com/folderview?id=0B4q6REcT3R4WcmN0bElLRHJUbHc&usp=sharing