# [SciPy-Dev] SciPy Goal

Fernando Perez fperez.net@gmail....
Wed Jan 4 20:22:16 CST 2012

```
Hi all,

On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant <travis@continuum.io> wrote:
> What do others think is missing?  Off the top of my head:   basic wavelets
> (dwt primarily) and more complete interpolation strategies (I'd like to
> finish the basic interpolation approaches I started a while ago).
> Originally, I used GAMS as an "overview" of the kinds of things needed in
> SciPy.   Are there other relevant taxonomies these days?

Well, it probably doesn't map one-to-one onto these ideas for scipy,
but the Berkeley 'thirteen dwarfs' list from the 'View from Berkeley'
paper on parallel computing is not a bad starting point.  In summary,
they are:

Dense Linear Algebra
Sparse Linear Algebra
Spectral Methods
N-Body Methods
Structured Grids
Unstructured Grids
MapReduce
Combinational Logic
Graph Traversal
Dynamic Programming
Backtrack and Branch-and-Bound
Graphical Models
Finite State Machines

Descriptions of each can be found here:
http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is
here:

http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html

That list is biased towards the classes of codes used in
supercomputing environments, and some of the topics are probably
beyond the scope of scipy (say structured/unstructured grids, at least
for now).

But it can be a decent guiding outline for reasoning about what the
'big areas' of scientific computing are, so that scipy at least
provides building blocks that would be useful in these directions.

One area that hasn't been mentioned much so far is the situation
with statistical tools.  On the one hand, we have the phenomenal work
of pandas, statsmodels and sklearn, which together are helping turn
python into a great tool for statistical data analysis (understood in
a broad sense).  But it would probably be valuable to have enough of a
statistical base directly in numpy/scipy so that the 'out of the box'
experience for statistical work is improved.  I know we have
scipy.stats, but it seems like it needs some love.

Cheers,

f
```