[SciPy-Dev] SciPy Goal

Warren Weckesser warren.weckesser@enthought....
Thu Jan 5 00:02:19 CST 2012


On Wed, Jan 4, 2012 at 9:29 PM, Travis Oliphant <travis@continuum.io> wrote:

>
> On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote:
>
> > Hi all,
> >
> > On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant <travis@continuum.io> wrote:
> >> What do others think is missing?  Off the top of my head: basic wavelets
> >> (dwt primarily) and more complete interpolation strategies (I'd like to
> >> finish the basic interpolation approaches I started a while ago).
> >> Originally, I used GAMS as an "overview" of the kinds of things needed in
> >> SciPy.  Are there other relevant taxonomies these days?
> >
> > Well, probably not something that fits these ideas for scipy
> > one-to-one, but the Berkeley 'thirteen dwarves' list from the 'View
> > from Berkeley' paper on parallel computing is not a bad starting
> > point; summarized here they are:
> >
> >    Dense Linear Algebra
> >    Sparse Linear Algebra [1]
> >    Spectral Methods
> >    N-Body Methods
> >    Structured Grids
> >    Unstructured Grids
> >    MapReduce
> >    Combinational Logic
> >    Graph Traversal
> >    Dynamic Programming
> >    Backtrack and Branch-and-Bound
> >    Graphical Models
> >    Finite State Machines
>
>
> This is a nice list, thanks!
>
> >
> > Descriptions of each can be found here:
> > http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is
> > here:
> >
> > http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html
> >
> > That list is biased towards the classes of codes used in
> > supercomputing environments, and some of the topics are probably
> > beyond the scope of scipy (say structured/unstructured grids, at least
> > for now).
> >
> > But it can be a decent guiding outline to reason about what are the
> > 'big areas' of scientific computing, so that scipy at least provides
> > building blocks that would be useful in these directions.
> >
>
> Thanks for the links.
>
>
> > One area that hasn't been directly mentioned too much is the situation
> > with statistical tools.  On the one hand, we have the phenomenal work
> > of pandas, statsmodels and sklearn, which together are helping turn
> > python into a great tool for statistical data analysis (understood in
> > a broad sense).  But it would probably be valuable to have enough of a
> > statistical base directly in numpy/scipy so that the 'out of the box'
> > experience for statistical work is improved.  I know we have
> > scipy.stats, but it seems like it needs some love.
>
> It seems like scipy stats has received quite a bit of attention.   There
> is always more to do, of course, but I'm not sure what specifically you
> think is missing or needs work.



Test coverage, for example.  I recently fixed several wildly incorrect
skewness and kurtosis formulas for some distributions, and I now have very
little confidence that any of the other distributions are correct.  Of
course, most of them probably *are* correct, but without tests, all are in
doubt.
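
A minimal sketch of the kind of test that would catch such errors, assuming a frozen continuous distribution from scipy.stats: it compares the skewness and excess kurtosis reported by the distribution's stats() method against values obtained by integrating the pdf numerically. The helper names (numeric_skew_kurtosis, skew_kurtosis_agree) and the tolerances are illustrative choices, not part of SciPy.

```python
import numpy as np
from scipy import integrate, stats


def numeric_skew_kurtosis(dist):
    """Skewness and excess kurtosis by direct numerical integration of dist.pdf.

    `dist` is assumed to be a frozen continuous distribution, e.g. stats.gamma(2.5).
    """
    lower, upper = dist.support()

    def moment_about(center, order):
        # Integral of (x - center)**order * pdf(x) over the support.
        value, _ = integrate.quad(
            lambda x: (x - center) ** order * dist.pdf(x), lower, upper
        )
        return value

    mean = moment_about(0.0, 1)
    var = moment_about(mean, 2)
    skew = moment_about(mean, 3) / var ** 1.5
    excess_kurt = moment_about(mean, 4) / var ** 2 - 3.0
    return skew, excess_kurt


def skew_kurtosis_agree(dist, rtol=1e-6, atol=1e-6):
    """True if the closed-form values from dist.stats match the numerical ones."""
    closed_form = dist.stats(moments="sk")
    numeric = numeric_skew_kurtosis(dist)
    return bool(np.allclose(closed_form, numeric, rtol=rtol, atol=atol))


if __name__ == "__main__":
    for name, dist in [
        ("norm", stats.norm()),
        ("expon", stats.expon()),
        ("gamma(2.5)", stats.gamma(2.5)),
    ]:
        print(name, skew_kurtosis_agree(dist))
```

Numerical integration is only an independent cross-check here; heavy-tailed distributions whose third or fourth moments do not exist would need to be excluded or handled separately.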

Warren


> A big question to me is the impact of data-frames as the underlying
> data-representation of the algorithms and the relationship between the
> data-frame and a NumPy array.
>
> -Travis
>
>
> >
> > Cheers,
> >
> > f
> > _______________________________________________
> > SciPy-Dev mailing list
> > SciPy-Dev@scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-dev