[SciPy-dev] Abstract vectors in optimization

Ben FrantzDale benfrantzdale@gmail....
Tue Jan 6 07:00:24 CST 2009


Thanks for the reply. Let me describe the problem I've thought most about to
give you a sense of where I am coming from:

The optimization problem I have worked most with is atomistic simulation
(particularly relaxation) in which you have millions of atoms all
interacting with a force function. The force function is sparse but as the
atoms move, the sparseness pattern changes (with sparseness structures that
can be updated automatically when the state vector is "moved"). Furthermore,
the atoms can be distributed via MPI, so while they happen to be a size-3n
array of doubles locally, local atom 1,234 at one moment may not be the same
atom as local atom 1,234 at the next step. (I don't know the details of
PyMPI or how garbage collection might interact with MPI, but using SciPy for
massively-parallel optimization is another motiviation for this post.)

A common solver to use to relax the state of this sort of system is
nonlinear CG. However, I've seen many separate implementations of CG
hand-written to operate on these data structures. As a software engineer,
this is sad to see; really one nonlinear CG solver should be reusable enough
that it can be dropped in and used. I tried writing a sufficiently-generic
CG solver in C++, but designing the right family of types, particularly with
memory-management issues in mind, got out of hand. Given the cost function
is the expensive part of the sorts of problems I am interested in, the
overhead of Python is tiny, and SciPy has nice optimization libraries, which
lead me to consider generalizing SciPy functions.

I agree with Robert's comment about non-Euclidean metrics, such as working
with SO(3). Similarly, I have thought it would be interesting to optimize
function spaces (as in finite-element solvers) in which it should be
possible to add two functions over the same domain but with different
discretizations, so that __add__ would implicitly remesh one or the other
functions rather than having __add__ require the same basis functions on the
LHS and RHS so it can just add corresponding scalar coefficients.

As for the "few other things", the only example I tried changing is
nonlinear CG, and that really didn't require too many things. See below for
the SimpleHilbertVector class I tested with, which just hides a numpy array
behind a Hilbert-space interface.  Obviously methods that require
second-order approximations (i.e., a Hessian), such as Levenberg-Marquardt,
would need some way to represent the Hessian, so that is a method that would
at least require more thought to try to implement in an interface-only way.

As for such as slicing, broadcasting, etc. I didn't use of this in fmin_cg.
I see one use of slicing in optimize.py (in brute), but nothing about
broadcasting. Can you point to what you mean?


# Appologies to whatever degree the following is non-Pythonic:
class HilbertVector:
    def norm(self, norm = 2):
        if norm == 2:
            return sqrt(self.inner_product(self))

class SimpleHilbertVector(HilbertVector):
    This is a simple Hilbert vector.
    def __init__(self, arr):
        if isinstance(arr, SimpleHilbertVector):
            self.data = arr.data
            self.data = array(arr)

    def inner_product(self, other):
        return dot(self.data, other.data)

    def norm(self, norm = 2):
            HilbertVector.norm(self, norm)
            if norm == Inf:
                return max(abs(self.data))

    def __mul__(self, other):
        return SimpleHilbertVector(self.data.__mul__(other))

    def __div__(self, other):
        return SimpleHilbertVector(self.data.__div__(other))

    def __truediv__(self, other):
        return SimpleHilbertVector(self.data.__truediv__(other))

    def __rdiv__(self, other):
        return SimpleHilbertVector(self.data.__rdiv__(other))

    def __rmul__(self, other):
        return SimpleHilbertVector(self.data.__mul__(other))

    def __add__(self, other):
        return SimpleHilbertVector(other.data.__radd__(self.data))

    def __radd__(self, other):
        return SimpleHilbertVector(other.data.__add__(self.data))

    def __sub__(self, other):
        return SimpleHilbertVector(other.__rsub__(self.data))

    def __rsub__(self, other):
        return SimpleHilbertVector(other.__sub__(self.data))

    def __neg__(self): return SimpleHilbertVector(self.data.__neg__())

    def __len__(self): return self.data.__len__()

    def __str__(self): return self.data.__str__() + " (SimpleHilbertVector)"

On Tue, Jan 6, 2009 at 12:27 AM, David Cournapeau <
david@ar.media.kyoto-u.ac.jp> wrote:

> Hi Ben,
> Ben FrantzDale wrote:
> > I've started playing with SciPy for numerical optimization. So far, it
> > looks like a great set of useful algorithm implementations.
> >
> > If I understand the architecture correctly, there is one change that
> > could make these libraries even more powerful, particularly for
> > large-scale scientific computing. The optimization algorithms seem to
> > be designed to work only with NumPy vectors. Mathematically, though,
> > many of these algorithms can operate on an arbitrary Hilbert space, so
> > the algoirhtms can be slightly rewritten to work with any objects that
> > support addition, scalar multiplication, and provide an inner product,
> > and a few other things.
> It may be true mathematically, but I don't think it is such a useful
> abstraction for implementation. The few other things is what matters the
> most, actually. For example, many if not most scipy algorithms depend on
> advanced features provided by numpy, such as slicing, broadcasting,
> etc... Implementing everything in scipy in terms of an abstract object
> with only the operations implied by Euclidean structures would be quite
> awkward.
> >
> > If these functions could be made more general in this way, I could see
> > the SciPy optimization tools being extremely useful for large-scale
> > optimization problems in which people already have the cost function,
> > its gradient, and the representation of a state vector implemented
> > (e.g., in C++).
> You can already do that with some effort, depending on the memory
> representation of your state vector: you make a numpy array from your
> data (you can of course almost always convert to a numpy array anyway,
> but I assume you want to avoid copies, since you talk about large-scale
> optimization).
> I want to ask: why would you *not* want to use numpy ? What does it
> bring to you ?
> cheers,
> David
> _______________________________________________
> Scipy-dev mailing list
> Scipy-dev@scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/scipy-dev/attachments/20090106/2bd143ff/attachment-0001.html 

More information about the Scipy-dev mailing list