[Numpy-discussion] slow numpy.clip ?
david at ar.media.kyoto-u.ac.jp
Tue Dec 19 20:30:45 CST 2006
Travis Oliphant wrote:
> Robert Kern wrote:
>> David Cournapeau wrote:
>>> Basically, at least from those figures, both versions are pretty
>>> similar, and not worth improving much anyway for matplotlib. There is
>>> something funny with numpy version, though.
>> Looking at the code, it's certainly not surprising that the current
>> implementation of clip() is slow. It is a direct numpy C API translation of the
>> following (taken from numarray, but it is the same in Numeric):
>> def clip(m, m_min, m_max):
>>     """clip() returns a new array with every entry in m that is less than m_min
>>     replaced by m_min, and every entry greater than m_max replaced by m_max.
>>     """
>>     selector = ufunc.less(m, m_min)+2*ufunc.greater(m, m_max)
>>     return choose(selector, (m, m_min, m_max))
> There are a lot of functions that are essentially this. Many things
> were done to just get something working. It would seem like a good idea
> to re-code many of these to speed them up.
>> Creating that integer selector array is probably the most expensive part.
>> Copying the array, then using putmask() or similar is certainly a better
>> approach, and I can see no drawbacks to it.
>> If anyone is up to translating their faster clip() into C, I'm more than happy
>> to check it in. I might also entertain adding a copy=True keyword argument, but
>> I'm not entirely certain we should be expanding the API during the 1.0.x series.
> The problem with the copy=True keyword is that it would imply needing to
> expand the C-API for PyArray_Clip and should not be done until 1.1 IMHO.
> We would probably be better off not expanding the keyword arguments to
> methods as well until that time.
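For reference, the two approaches discussed above can be sketched at the Python level as follows. This is only an illustration, not the actual C implementation: the function names clip_choose and clip_putmask are made up for this example, it uses the modern numpy spellings (np.less, np.greater, np.choose, np.putmask) rather than the numarray ones, and it assumes scalar bounds with m_min <= m_max.

```python
import numpy as np


def clip_choose(m, m_min, m_max):
    """Selector-based clip, mirroring the quoted numarray code."""
    m = np.asarray(m)
    # selector is 0 where m is in range, 1 where m < m_min, 2 where m > m_max.
    # Assumes m_min <= m_max, otherwise an entry could get the invalid index 3.
    selector = np.less(m, m_min) + 2 * np.greater(m, m_max)
    return np.choose(selector, (m, m_min, m_max))


def clip_putmask(m, m_min, m_max):
    """Copy the input once, then overwrite out-of-range entries in place."""
    out = np.array(m, copy=True)
    np.putmask(out, out < m_min, m_min)
    np.putmask(out, out > m_max, m_max)
    return out
```

The first version has to build the integer selector array on top of the two boolean comparison results before choose() can run. The second needs only the copy and the two boolean masks, and leaves the input untouched.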
When I got back home, I started taking a close look at the numpy/core C
sources, with the help of the numpy ebook. The huge source files make it
really difficult for me to follow some things: I was wondering if there
is some rationale behind it, or if this is just a remnant of numpy's
older development.
The main problem I have with those huge files is that I get confused
between the functions that are part of the public API, the ones kept for
backward compatibility, etc. I wanted to extract the PyArray_TakeFrom function
to see where the time is spent, but this is quite difficult, because of
My question is then: is there any plan to change this? If not, is it
for reasons I don't see, or just because of a lack of manpower?