[SciPy-user] Distributed Array Library?
Fri Apr 27 10:47:47 CDT 2007
Some thoughts and potential directions:
1. petsc4py is definitely worth looking at
2. PyTrilinos is another really good parallel array/matrix library:
It seems very powerful and is well supported.
3. Global arrays
Robert Harrison at ORNL has python bindings to this. They probably
need updating, and I am not sure if/where they can be downloaded.
This could be very nice. It also might make sense to do a simple
ctypes wrapper for the global array library. I would be interested in
4. Someone could write low-level code using numpy+mpi4py
We (the IPython1 devs) have thought about this some. To provide basic
distributed arrays wouldn't be very difficult. The challenge is that
once you have such things, people will want things like eigensolvers,
linear solvers, etc. These wouldn't be as easy. But there would be
an advantage. The above packages are focused on linear algebra
(matrices). For higher-rank tensors, I am not sure they are that
great. It would be really nice to have something that was better for
"real" tensor work. But it might make sense to go with global arrays
instead.
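To give a rough idea of what the "basic distributed arrays" part of option 4 could look like, here is a minimal sketch. It is not an existing library and not the IPython1 design; the names block_bounds and local_partial are made up for illustration. It splits the contracted index of a tensor product into contiguous blocks, one per process, and sums the partial results. To keep it runnable without a cluster, the "ranks" are simulated in a loop inside one process using numpy only; with mpi4py, each rank would hold only its own slices and the final sum would be a reduction such as comm.Allreduce with op=MPI.SUM.

```python
import numpy as np


def block_bounds(n, size, rank):
    """Return the [start, stop) range of a length-n axis owned by `rank`.

    The axis is split into `size` contiguous blocks whose lengths differ
    by at most one (the first n % size ranks get one extra element).
    """
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def local_partial(a, b, size, rank):
    """Contract only the part of the shared index owned by one rank.

    a has shape (i, k) and b has shape (k, j, l); the shared index k is
    the distributed one. Each rank produces a full-shape (i, j, l)
    partial result that still has to be summed over all ranks.
    """
    lo, hi = block_bounds(a.shape[1], size, rank)
    return np.tensordot(a[:, lo:hi], b[lo:hi], axes=([1], [0]))


rng = np.random.default_rng(0)
a = rng.standard_normal((4, 10))
b = rng.standard_normal((10, 3, 5))

size = 3  # hypothetical number of processes

# Simulated reduction: in a real MPI run each rank would compute one
# partial and the sum would be done by the communicator, not by a loop.
result = sum(local_partial(a, b, size, rank) for rank in range(size))
```

Distributing the contracted index keeps each process's inputs small, but every process still ends up with a full-size result, so this only helps when the output fits on one node. Distributing an uncontracted index instead avoids the reduction but requires the other operand to be available everywhere. Transposes and reshapes across the distributed axis are where this gets harder.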
While IPython1 doesn't provide any distributed array library, it would
provide a very nice context in which to use one of the above
solutions. It would integrate seamlessly with any of the above and
enable interactive development/debugging/execution. Here are details:
It would be great to have a tutorial showing how to use petsc4py or
PyTrilinos with IPython1. Any takers?
On 4/26/07, Gregory Crosswhite <firstname.lastname@example.org> wrote:
> Hey everyone! I would appreciate some advice on a problem I am facing.
> I have written a code using the numpy library that (among other
> things) performs contractions of a tensor network. Unfortunately, I
> have reached the point where my tensors are growing too big to handle
> in a single computer, so I want to rework my code so that it works on
> a cluster or grid.
> My question is: do you have suggestions for tools that would let me
> have ndarray-like functionality with an array that could be
> distributed over many processors? Specifically, I would like to be
> able to create very large (possibly multi-gigabyte) tensors with an
> arbitrary number of dimensions, to be able to transpose indices and
> reshape dimensions, and to take general tensor products.
> After searching online, it looked like there was a package online
> called GlobalArrays that allows one to easily create distributed
> arrays, but it has the following characteristics that I would have to
> work around:
> *) No Python binding at present. (One used to exist, but it has
> disappeared from the internet. :-) )
> *) No capability for transposing indices or reshaping dimensions
> *) The distributed inner product operations do not take stride
> I also saw something called the Tensor Contraction Engine which might
> have some support for this kind of thing, but the documentation for
> the actual tensor contraction part of the system seemed very sparse
> so I cannot tell whether .
> I wonder whether it would be feasible to integrate something like
> this into the numpy core; I looked through the Guide to NumPy (thanks
> to Travis for taking the time to write such comprehensive
> documentation!) and saw that there were various hooks to implement
> one's own type, along with operations to perform a dot product,
> ufuncs, and the like, but all of these seem to assume that one has a
> uniform memory layout so that adapting them for a distributed array
> would be an exercise in futility.
> Do the wise men and women of this list have any advice regarding the
> best tool to use? :-)
> Thank you very much in advance!
> - Gregory Crosswhite