[SciPy-dev] [GSoC 2008] Machine learning package in SciPy
Stéfan van der Walt
Tue Mar 11 22:01:44 CDT 2008
On Tue, Mar 11, 2008 at 11:50 AM, Anton Slesarev wrote:
> Hi all,
> it might be a good idea to have a machine learning (ML) package in SciPy. As
> I understand it, there is some ML code in SciKits, but it is in a raw state?
> There are a lot of machine learning projects, each with its own data format,
> set of classifiers, feature selection algorithms and benchmarks. But if
> you want to compare your own algorithm with others, you have to convert
> your data to the input format of every tool you want to use and, after
> training, convert the output of each tool to a single
> format in order to compare results (for example, to see
> which features were selected in common by different tools).
> Now I'm analyzing different ML approaches for the special case of text
> classification. I couldn't find an ML framework appropriate for my
> task. I have two simple requirements for such a framework: it should support
> a sparse data format and have at least an SVM classifier. For example, Orange [1]
> is a very good data mining project but has poor sparse format support. PyML [2]
> has all the needed features, but there are problems with installation on
> different platforms, and the code design is not perfect.
> I believe that creating a framework that makes it convenient for scientists to
> integrate their algorithms into it is a very useful challenge. Scientists
> often talk about standard machine learning software, and maybe SciPy would
> be an appropriate platform for developing such software.
> I can write a detailed proposal, but first I want to see whether this is
> interesting to anyone. Any wishes or recommendations?
> 1. Orange http://magix.fri.uni-lj.si/orange/
> 2. PyML http://pyml.sourceforge.net/
> 3. The Need for Open Source Software in Machine Learning
I also recently learned of Elefant,
but haven't had a chance to investigate in more detail.
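
To make the format-conversion burden described in the quoted message a bit more concrete, here is a rough sketch in plain Python. It assumes the "label index:value" sparse text layout used by SVM tools such as LIBSVM and SVMlight, with 1-based feature indices. The function names are hypothetical and do not come from Orange, PyML or any existing SciPy code; it only illustrates reading sparse data into a common in-memory form and finding the features that several tools agree on.

```python
# Illustrative sketch only: all helper names here are made up for this example.

def parse_sparse_line(line):
    """Parse 'label idx:val idx:val ...' into (label, {idx: val})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features


def to_dense(features, n_features):
    """Expand a sparse {idx: val} mapping (1-based indices) into a dense list."""
    row = [0.0] * n_features
    for index, value in features.items():
        row[index - 1] = value
    return row


def common_features(*selections):
    """Return the sorted feature indices that every tool selected."""
    sets = [set(s) for s in selections]
    return sorted(set.intersection(*sets))


# One sparse sample, converted for a tool that only accepts dense input.
label, features = parse_sparse_line("1 3:0.5 7:1.25")
dense = to_dense(features, 8)

# Feature indices reported by three hypothetical tools.
shared = common_features([3, 7, 9], [7, 3], [1, 3, 7])
```

A shared package would presumably keep the data sparse throughout rather than expanding it; the dense conversion is shown only because that is the step text classification data makes expensive.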