[SciPy-dev] [GSoC 2008]Machine learning package in SciPy

Matthieu Brucher matthieu.brucher@gmail....
Tue Mar 11 14:05:24 CDT 2008


Hi,

David Cournapeau maintains the learn scikit, which is the main place where
machine learning code will go.
For instance, it has classifiers (SVMs via libsvm), and the most widely used
manifold learning techniques will be added in the near future.

I didn't understand what you meant by "you want to see common which features
was selected by different tools".
Sparse matrix support for libsvm has to be implemented at the C level; you
would have to ask Albert, who wrapped libsvm.
For the manifold learning code, the techniques that can work with sparse
matrices support them (for instance Laplacian Eigenmaps).
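
To make the Laplacian Eigenmaps remark concrete, here is a rough sketch of how
the method can stay sparse from end to end. It is not the code from the learn
scikit; the function names (knn_affinity, laplacian_eigenmap), the 0/1
k-nearest-neighbour weights and the toy data are assumptions made only for
illustration. It relies on scipy.sparse and scipy.sparse.linalg.eigsh.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import eigsh


def knn_affinity(points, n_neighbors=3):
    """Symmetric sparse 0/1 k-nearest-neighbour affinity matrix (hypothetical helper)."""
    n = points.shape[0]
    # Dense pairwise distances are acceptable for a toy example only.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbors = np.argsort(dists, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = neighbors.ravel()
    w = sparse.csr_matrix((np.ones(rows.size), (rows, cols)), shape=(n, n))
    return w.maximum(w.T)  # symmetrise; the result is still sparse


def laplacian_eigenmap(w, n_components=2, seed=0):
    """Solve (D - W) y = lam * D y for the smallest non-trivial lam.

    Works on S = D^-1/2 W D^-1/2, whose largest eigenvalues mu give
    lam = 1 - mu and y = D^-1/2 u. Assumes a connected graph.
    """
    degrees = np.asarray(w.sum(axis=1)).ravel()
    inv_sqrt = sparse.diags(1.0 / np.sqrt(degrees))
    s = inv_sqrt @ w @ inv_sqrt  # sparse normalised affinity
    v0 = np.random.RandomState(seed).rand(w.shape[0])  # reproducible start vector
    mu, u = eigsh(s, k=n_components + 1, which="LA", v0=v0)
    order = np.argsort(mu)[::-1]  # largest mu first
    keep = order[1:]  # drop the trivial eigenvalue mu = 1
    embedding = u[:, keep] / np.sqrt(degrees)[:, None]
    return embedding, 1.0 - mu[keep]


# Toy data: 30 points along a parabola.
t = np.linspace(0.0, 1.0, 30)
points = np.column_stack([t, t ** 2])
w = knn_affinity(points, n_neighbors=3)
embedding, lams = laplacian_eigenmap(w, n_components=2)
```

Only the neighbour search above is dense; the affinity matrix, the normalised
matrix and the eigensolver all operate on sparse data, which is why this kind
of technique can accept sparse input directly.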

Matthieu

2008/3/11, Anton Slesarev <slesarev.anton@gmail.com>:
>
> Hi all,
>
> It might be a good idea to have a machine learning (ML) package in SciPy.
> As I understand it, there is some ML code in SciKits, but is it in a raw state?
>
> There are a lot of machine learning projects, each with its own data format,
> number of classifiers, feature selection algorithms and benchmarks. But if
> you want to compare your own algorithm with some others, you have to convert
> your data to the input format of every tool you want to use and, after
> training, convert the output format of each tool to a single
> format to have facility to compare results(for example you want to see
> common which features was selected by different tools).
>
> I am currently analyzing different ML approaches for the special case of
> text classification, and I couldn't find an ML framework appropriate for my
> task. I have two simple requirements for such a framework: it should support
> a sparse data format and have at least an SVM classifier. For example, Orange [1]
> is a very good data mining project but has poor sparse format support. PyML
> [2] has all the needed features, but there are installation problems on
> different platforms and the code design is not perfect.
>
> I believe that creating a framework into which scientists can conveniently
> integrate their algorithms is a very useful challenge. Scientists
> often talk about standard machine learning software [3], and maybe SciPy will
> be an appropriate platform for developing such software.
>
> I can write a detailed proposal, but first I want to see whether anyone is
> interested. Any wishes and recommendations?
>
> 1. Orange http://magix.fri.uni-lj.si/orange/
> 2. PyML http://pyml.sourceforge.net/
> 3. The Need for Open Source Software in Machine Learning
> http://www.jmlr.org/papers/volume8/sonnenburg07a/sonnenburg07a.pdf
>
> --
> Anton Slesarev


-- 
French PhD student
Website : http://matthieu-brucher.developpez.com/
Blogs : http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn : http://www.linkedin.com/in/matthieubrucher