python-weka-wrapper: 0.3.0 released

It’s been a while since the last release, and this one brings quite a number of bug fixes and additions (e.g., database access and text mining), so it is well worth the upgrade. A major addition is the workflow component, which wraps much of the functionality that the python-weka-wrapper library offers in a convenient way. Check out the examples in the src/wekaexamples/flow sub-directory of the examples repository. The workflow was inspired by the one available in the ADAMS framework, though it is much simpler. It is a pure-Python implementation and has nothing to do with Weka’s KnowledgeFlow.

Changes:

  • added method “ndarray_to_instances” to “weka.converters” module for converting 2-dimensional NumPy array into “Instances” object
  • added method “plot_learning_curve” to “weka.plot.classifiers” module for creating learning curves for multiple classifiers for a specific metric
  • added plotting of experiments with “plot_experiment” method in “weka.plot.experiments” module
  • “Instance.create_instance” method now takes list of tuples (index, internal float value) when generating sparse instances
  • added “weka.core.database” module for loading data from a database
  • added “make_copy” class method to “Clusterer” class
  • added “make_copy” class method to “Associator” class
  • added “make_copy” class method to “Filter” class
  • added “make_copy” class method to “DataGenerator” class
  • most classes (like Classifier and Filter) now have a default classname value in the constructor
  • added “TextDirectoryLoader” class to “weka.core.converters”
  • moved all methods from “weka.core.utils” to “weka.core.classes”
  • fixed “Attribute.index_of” method for determining label index
  • fixed “Attribute.add_string_value” method (used incorrect JNI parameter)
  • “create_instance” and “create_sparse_instance” methods of class “Instance” now ensure that list values are float
  • added “to_help” method to “OptionHandler” class which outputs a help string generated from the base class’s “globalInfo” and “listOptions” methods
  • fixed “test_model” method of “Evaluation” class when supplying a “PredictionOutput” object (previously generated “No dataset structure provided!” exception)
  • added “batch_finished” method to “Filter” class for incremental filtering
  • added “line_plot” method to “weka.plot.dataset” module for plotting dataset using internal format (one line plot per instance)
  • added “is_serializable” property to “JavaObject” class
  • added “has_class” convenience property to “Instance” class
  • added “__repr__” method to “JavaObject” classes (simply calls “toString()” method)
  • added “Stemmer” class in module “weka.core.stemmers”
  • added “Stopwords” class in module “weka.core.stopwords”
  • added “Tokenizer” class in module “weka.core.tokenizers”
  • added “StringToWordVector” filter class in module “weka.filters”
  • added simple workflow engine (see documentation on Flow)
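
As an illustration of the sparse-instance change listed above, here is a minimal pure-Python sketch of the “list of tuples (index, internal float value)” shape. The helper name “to_sparse_pairs” is hypothetical and not part of the library. The sketch assumes zero entries are simply left out, as is usual for sparse data, and it converts every value to float in line with the float check mentioned in the changelog. Passing the result on to the “Instance” class needs a running JVM and is not shown here.

```python
def to_sparse_pairs(dense_row):
    """Return (index, value) pairs for the non-zero entries of a dense row.

    Hypothetical helper for illustration only; it does not call
    python-weka-wrapper. Values are converted to float.
    """
    return [(i, float(v)) for i, v in enumerate(dense_row) if float(v) != 0.0]


# A dense row with mostly zero entries
row = [0, 3, 0, 0, 1.5]
pairs = to_sparse_pairs(row)
print(pairs)  # [(1, 3.0), (4, 1.5)]
```

Only positions 1 and 4 carry a value, so only those two pairs are kept; every other attribute is implicitly zero.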