[SciPy-dev] [GSoC 2008] Machine learning package in SciPy

Anton Slesarev slesarev.anton@gmail....
Tue Mar 11 13:50:32 CDT 2008

Hi all,

It might be a good idea to have a machine learning (ML) package in SciPy. As
I understand it, there is some ML code in SciKits, but is it still in a raw
state?

There are a lot of machine learning projects, each with its own data format,
set of classifiers, feature selection algorithms, and benchmarks. But if you
want to compare your own algorithm with others, you have to convert your data
to the input format of every tool you want to use, and after training you
have to convert the output of each tool to a single format in order to
compare results (for example, to see which features were selected in common
by different tools).
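
To make the comparison step concrete, here is a minimal sketch, assuming the
output of each tool has already been converted to a plain set of feature
names. The tool names, feature names, and helper functions are hypothetical
and only illustrate the idea.

```python
from itertools import combinations

# Hypothetical data: features reported by three tools, already converted
# to one common format (a set of feature names per tool).
selected = {
    "tool_a": {"price", "free", "offer", "meeting"},
    "tool_b": {"free", "offer", "click", "meeting"},
    "tool_c": {"offer", "free", "report"},
}


def common_features(selections):
    """Return the features that every tool selected."""
    sets = list(selections.values())
    if not sets:
        return set()
    return set.intersection(*sets)


def pairwise_jaccard(selections):
    """Return the Jaccard overlap of the selections for each pair of tools."""
    overlaps = {}
    for a, b in combinations(sorted(selections), 2):
        union = selections[a] | selections[b]
        shared = selections[a] & selections[b]
        # Two empty selections are treated as identical.
        overlaps[(a, b)] = len(shared) / len(union) if union else 1.0
    return overlaps


if __name__ == "__main__":
    print(sorted(common_features(selected)))
    for pair, score in pairwise_jaccard(selected).items():
        print(pair, round(score, 2))
```

Once every tool writes to the same format, this kind of comparison is a few
lines of code; today most of the effort goes into the conversions.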

I am currently analyzing different ML approaches for the special case of text
classification. I couldn't find an ML framework suitable for my task. I have
two simple requirements for such a framework: it should support a sparse data
format and have at least an SVM classifier. For example, Orange [1] is a very
good data mining project but has poor support for sparse formats. PyML [2]
has all the needed features, but it is hard to install on different platforms
and its code design is not perfect.
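
As an illustration of what sparse support means for text data, here is a
minimal sketch, assuming the "label index:value" text layout used by tools
such as SVMlight and LIBSVM. That choice of layout and both function names
are my assumptions for the example, not an existing SciPy API.

```python
def parse_sparse_line(line):
    """Parse 'label idx:val idx:val ...' into (label, {idx: val}).

    Only the non-zero features are stored, which is what makes the
    representation practical for documents with very large vocabularies.
    """
    line = line.split("#", 1)[0].strip()  # drop a trailing comment, if any
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":", 1)
        features[int(idx)] = float(val)
    return label, features


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {index: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(val * b[idx] for idx, val in a.items() if idx in b)


if __name__ == "__main__":
    label, doc = parse_sparse_line("+1 3:0.5 10:2 # first document")
    print(label, doc)
    print(sparse_dot(doc, {10: 3.0, 7: 1.0}))
```

A linear SVM needs little more than such dot products, so a framework that
keeps data in this form never has to build the full dense matrix.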

I believe that creating a framework into which scientists can conveniently
integrate their algorithms is a very useful challenge. Scientists often talk
about the need for standard machine learning software [3], and maybe SciPy
would be an appropriate platform for developing such software.

I can write a detailed proposal, but first I want to see whether anyone is
interested. Any wishes or recommendations?

1. Orange http://magix.fri.uni-lj.si/orange/
2. PyML http://pyml.sourceforge.net/
3. The Need for Open Source Software in Machine Learning

Anton Slesarev