kea
The previous version of Marsyas (0.1) contained machine learning functionality, but until 2007 the new version (0.2) relied mostly on Weka for machine learning experiments. Although this situation was satisfactory for writing papers, it made it impossible to create real-time networks that integrate machine learning. An effort was therefore made to establish programming conventions for how machine learning MarSystems should be implemented. Last but not least, we have always wanted as much functionality related to audio processing systems as possible to be implemented natively in Marsyas.
kea is one of the outcomes of this effort. Kea (a rare bird
from New Zealand) is the Marsyas counterpart of Weka
and provides capabilities similar to Weka's command-line
interface, although much more limited (at least
for now).
Any Weka .arff file can be used as input to kea,
although usually the input is an .arff file
extracted by bextract. The following command-line options
are supported.
The main mode (train) performs 10-fold non-stratified cross-validation to evaluate the classification performance of the specified classifier on the provided .arff file. In addition to classification accuracy, it outputs several other summary measures of the classifier's performance, as well as the confusion matrix. The output format is similar to Weka's.
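To make the evaluation procedure concrete, here is a minimal Python sketch of 10-fold non-stratified cross-validation that reports accuracy and a confusion matrix. It is not kea's implementation: the function names, the fixed shuffle seed, and the nearest-centroid classifier are all assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds.

    Class labels are ignored, which is what makes the split non-stratified.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (hypothetical): one mean vector per class."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, model[label])),
    )


def cross_validate(X, y, k=10, seed=0):
    """Return (accuracy, confusion) where confusion[true][predicted] is a count."""
    labels = sorted(set(y))
    confusion = {t: {p: 0 for p in labels} for t in labels}
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        # Every instance is tested exactly once, by a model that never saw it.
        for i in fold:
            confusion[y[i]][predict(model, X[i])] += 1
    correct = sum(confusion[label][label] for label in labels)
    return correct / len(X), confusion


if __name__ == "__main__":
    # Two well-separated synthetic classes, 10 instances each.
    X = [[i * 0.1, 0.0] for i in range(10)] + [[10 + i * 0.1, 10.0] for i in range(10)]
    y = ["a"] * 10 + ["b"] * 10
    accuracy, confusion = cross_validate(X, y)
    print("accuracy:", accuracy)
    for true_label, row in confusion.items():
        print(true_label, row)
```

Because the folds are built without looking at the labels, an individual fold can over- or under-represent a class; a stratified split would balance class proportions per fold instead.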
The distance_matrix mode computes an NxN similarity matrix
from an input .arff file containing N feature vector instances. The
output format is the one used for the MIREX 2007 music similarity task. This
functionality relies on specific naming conventions related to the
Marsyas MIREX2007 submission. By default the output goes to dm.txt, but
a different file can be specified with the -dm command-line option.
The pca mode reduces the input feature vectors by projecting them onto the first 3 principal components using Principal Component Analysis (PCA). Each component is normalized to lie in the range [0, 512]. The transformed features are written to stdout.
The following examples show different ways kea can be used.
kea -w iris.arff
kea -m train -w iris.arff -cl SVM
kea -m distance_matrix -dm dmatrix.txt -w iris.arff
kea -m pca -w iris.arff
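As a rough illustration of what the distance_matrix mode produces, the following Python sketch builds an NxN matrix of pairwise distances from N feature vectors. The Euclidean metric and the function name are assumptions; this section does not state which metric kea uses, and the sketch does not attempt to reproduce the MIREX 2007 output format.

```python
import math


def distance_matrix(rows):
    """Return a symmetric NxN matrix of Euclidean distances between feature vectors."""
    n = len(rows)
    dm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only the upper triangle is computed; the matrix is mirrored.
        for j in range(i + 1, n):
            d = math.dist(rows[i], rows[j])
            dm[i][j] = dm[j][i] = d
    return dm


if __name__ == "__main__":
    # Three hypothetical 2-dimensional feature vectors.
    features = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    for row in distance_matrix(features):
        print(" ".join(f"{value:.3f}" for value in row))
```

The diagonal is zero and the matrix is symmetric, so for N instances only N(N-1)/2 distances actually need to be computed.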