[Numpy-discussion] Profiling numpy ? (parts written in C)

Eric Firing efiring at hawaii.edu
Wed Dec 20 00:00:53 CST 2006


John,

The current version of __call__ already includes substantial speedups 
prompted by David's profiling, and if I understand correctly the present 
bottleneck is actually the numpy take function.  That is not to say that 
other improvements can't be made, of course.

Eric

John Hunter wrote:
>>>>>> "David" == David Cournapeau <david at ar.media.kyoto-u.ac.jp> writes:
>     David> At the end, in the original context (speeding the drawing
>     David> of spectrogram), this is the problem. Even if multiple
>     David> backend/toolkits have obviously an impact in performances,
>     David> I really don't see why a numpy function to convert an array
>     David> to a RGB representation should be 10-20 times slower than
>     David> matlab on the same machine.
> 
> This isn't exactly right.  When matplotlib converts a 2D grayscale
> array to rgba, a lot goes on under the hood.  It's all numpy, but it's
> far from single function and it involves many passes through the
> data.  In principle, this could be done with one or two passes through
> the data.  In practice, our normalization and colormapping abstractions
> are so abstract that it is difficult (though not impossible) to
> special case and optimize.
> 
> The top-level routine is
> 
>     def to_rgba(self, x, alpha=1.0):
>         '''Return a normalized rgba array corresponding to x.
>         If x is already an rgb or rgba array, return it unchanged.
>         '''
>         if hasattr(x, 'shape') and len(x.shape)>2: return x
>         x = ma.asarray(x)
>         x = self.norm(x)
>         x = self.cmap(x, alpha)
>         return x
> 
> which implies at a minimum two passes through the data, one for norm
> and one for cmap.
> 
> In 99% of the use cases, cmap is a LinearSegmentedColormap though
> users can define their own as long as it is callable.  My guess is
> that the expensive part is Colormap.__call__, the base class for
> LinearSegmentedColormap.  We could probably write some extension code
> that does the following routine in one pass through the data.  But it
> would be hairy.  In a quick look and rough count, I see about 10
> passes through the data in the function below.
> 
> If you are interested in optimizing colormapping in mpl, I'd start
> here.  I suspect there may be some low hanging fruit.
> 
>     def __call__(self, X, alpha=1.0):
>         """
>         X is either a scalar or an array (of any dimension).
>         If scalar, a tuple of rgba values is returned, otherwise
>         an array with the new shape = oldshape+(4,). If the X-values
>         are integers, then they are used as indices into the array.
>         If they are floating point, then they must be in the
>         interval (0.0, 1.0).
>         Alpha must be a scalar.
>         """
>         if not self._isinit: self._init()
>         alpha = min(alpha, 1.0) # alpha must be between 0 and 1
>         alpha = max(alpha, 0.0)
>         self._lut[:-3, -1] = alpha
>         mask_bad = None
>         if not iterable(X):
>             vtype = 'scalar'
>             xa = array([X])
>         else:
>             vtype = 'array'
>             xma = ma.asarray(X)
>             xa = xma.filled(0)
>             mask_bad = ma.getmask(xma)
>         if typecode(xa) in typecodes['Float']:
>             putmask(xa, xa==1.0, 0.9999999) #Treat 1.0 as slightly less than 1.
>             xa = (xa * self.N).astype(Int)
>         # Set the over-range indices before the under-range;
>         # otherwise the under-range values get converted to over-range.
>         putmask(xa, xa>self.N-1, self._i_over)
>         putmask(xa, xa<0, self._i_under)
>         if mask_bad is not None and mask_bad.shape == xa.shape:
>             putmask(xa, mask_bad, self._i_bad)
>         rgba = take(self._lut, xa)
>         if vtype == 'scalar':
>             rgba = tuple(rgba[0,:])
>         return rgba
> 
> 
> 
>  
> 
>     David> I will take into account all those helpful messages, and
>     David> hopefully come with something for the end of the week :),
> 
>     David> cheers
> 
>     David> David _______________________________________________
>     David> Numpy-discussion mailing list Numpy-discussion at scipy.org
>     David> http://projects.scipy.org/mailman/listinfo/numpy-discussion
> 
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at scipy.org
> http://projects.scipy.org/mailman/listinfo/numpy-discussion



More information about the Numpy-discussion mailing list