Machine Learning Demos in Mathematica®
Most of the graphics in the slides for the lecture Theoretical Concepts of Machine Learning were created by Ulrich Bodenhofer using Mathematica®. Because the functionality implemented in the underlying Mathematica® notebooks may be useful for students, other lecturers, and perhaps even machine learning practitioners, the notebooks have been made publicly available.
Overview
The Mathematica® notebooks that can be downloaded from this Web page are intended mainly for demonstration. They are not optimized for computation speed, memory usage, generality, or extensibility; the focus is on creating illustrative pictures for courses in machine learning. Some of the implemented methods (e.g. k-nearest neighbor, support vector machines) also work in higher dimensions, but the visualizations are limited to problems with one or at most two input variables. Presently, the following features are supported:
- visualization of two-dimensional binary classification data sets and two-dimensional binary classification/discriminant functions
- k-nearest neighbor for binary classification and regression
- computation and visualization of Bayes-optimal decision boundaries for binary classification problems under the assumption that both classes are distributed according to bivariate normal distributions (and the creation of such data sets)
- computation of ROC curves and AUC values (also up to the k-th false positive, e.g. ROC50/AUC50)
- visualization and evaluation of support vector machines; note that the Mathematica® notebooks do not implement SVM training themselves: they can only load models that have previously been created by libSVM (through the Perl script ParseModelFile.pl)
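The truncated ROC/AUC feature listed above (e.g. ROC50/AUC50) can be illustrated with a minimal Python sketch. This is not the code from the Mathematica® notebooks. The function name `roc_n`, the parameter `max_fp`, and the normalization (dividing by the number of false positives considered times the number of positives, so that a perfect ranking scores 1) are assumptions made for illustration, and the notebooks may use a different convention. Examples with a label greater than zero are treated as positives, and tied scores are simply ranked in input order.

```python
def roc_n(labels, scores, max_fp=None):
    """Sketch of a ROC curve and AUC truncated at the max_fp-th false positive.

    labels: positive examples have a label > 0 (e.g. +1), all others are negative.
    scores: larger score means "more likely positive".
    max_fp: number of false positives to consider (None = full curve).
    Returns (curve, auc); curve is a list of (false positives, true positives) counts.
    """
    # Rank examples by decreasing score; the stable sort keeps ties in input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, label in ranked if label > 0)
    n_neg = len(ranked) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    limit = n_neg if max_fp is None else min(max_fp, n_neg)
    if limit < 1:
        raise ValueError("max_fp must be at least 1")
    truncated = limit < n_neg

    tp = fp = 0
    area = 0  # for each false positive: number of positives ranked above it
    curve = [(0, 0)]
    for _, label in ranked:
        if label > 0:
            tp += 1
        else:
            fp += 1
            area += tp
        curve.append((fp, tp))
        if truncated and fp == limit:
            break  # stop at the limit-th false positive

    # Assumed normalization: a perfect ranking yields 1.0.
    auc = area / (limit * n_pos)
    return curve, auc


# Toy example with hypothetical labels and scores.
example_labels = [1, 1, -1, 1, -1, -1, 1, -1]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
full_curve, full_auc = roc_n(example_labels, example_scores)
short_curve, short_auc = roc_n(example_labels, example_scores, max_fp=2)
```

With `max_fp=50` this would correspond to what the list calls ROC50/AUC50, under the normalization assumed here.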
Download
- Mathematica® notebooks (require at least version 6.0)
[ML-Mathematica.zip] (4 files; 5.1MB; last update 2009-10-09)
- Data Sets
[ML-DataSets.zip] (1776 files; 1.4MB; last update 2008-01-23)
- Perl script for converting a libSVM model file into a Mathematica® notebook
[ParseModelFile.pl] (3KB; last update 2008-01-23)
Installation
- Unzip the two ZIP archives into directories of your choice
- Open the file General.nb with Mathematica® (version 6.0 or newer)
- Go to the section "Load Data Sets" and change the path in the SetDirectory command to the directory into which you unzipped the data files
Usage
Whenever you want to use one of the notebooks, first load General.nb and evaluate all of its cells before you evaluate any other notebook. That's it. You can also use the notebooks with your own data; explanations are provided inside the notebooks.
SVM Demos
The notebook SVM.nb loads Mathematica® notebooks that have been created by ParseModelFile.pl. The SVM-Demo sub-folder of the data set archive contains eight data sets in libSVM input format and, grouped into sub-folders DataSetX, a large number of libSVM model files and their Mathematica® counterparts. If you want to visualize your own SVMs, convert the model file into a Mathematica® notebook with ParseModelFile.pl and follow the instructions provided in SVM.nb. Using ParseModelFile.pl is simple: it reads its input from the console (standard input) and writes its output to the console (standard output), so you have to use input/output redirection to convert from file to file.
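The file-to-file conversion described above can also be scripted. The following is a minimal Python sketch, assuming that a `perl` interpreter is on the search path and that ParseModelFile.pl is in the current directory. The helper names `convert_file` and `convert_model` are hypothetical and are not part of the download. The sketch does the same thing as shell redirection of the form `perl ParseModelFile.pl < model-file > notebook-file`: it connects the script's standard input and standard output to files.

```python
import subprocess


def convert_file(command, input_path, output_path):
    """Run a console filter with stdin read from one file and stdout written to another."""
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(command, stdin=src, stdout=dst, check=True)


def convert_model(model_path, notebook_path, script="ParseModelFile.pl"):
    """Convert a libSVM model file into a Mathematica notebook (assumes perl is installed)."""
    convert_file(["perl", script], model_path, notebook_path)
```

A call such as `convert_model("my.model", "my.nb")`, with hypothetical file names, would then produce a notebook that can be loaded as described in SVM.nb.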