[Numpy-discussion] Slow performance in array protocol with string arrays
Todd Miller
jmiller at stsci.edu
Wed Jan 4 04:23:18 CST 2006
Francesc Altet wrote:
>Hi,
>
>Perhaps this is not very important because it only has an effect at high
>dimensionality, but I think it would be good to send it here for the
>record.
>
>It seems that the numarray implementation of the array protocol for string
>arrays is very slow for dimensionality > 10:
>
>In [258]: a=scicore.reshape(scicore.array((1,)), (1,)*15)
>
>In [259]: a
>Out[259]: array([[[[[[[[[[[[[[[1]]]]]]]]]]]]]]])
>
>In [260]: t1=time(); c=numarray.array(a);print time()-t1
>0.000355958938599 # numerical conversion is pretty fast: 0.3 ms
>
>In [261]: b=scicore.array(a, dtype="S1")
>
>In [262]: b
>Out[262]: array([[[[[[[[[[[[[[[1]]]]]]]]]]]]]]], dtype=(string,1))
>
>In [263]: t1=time(); c=numarray.strings.array(b);print time()-t1
>0.61981511116 # string conversion is more than 1000x slower
>
>In [264]: t1=time(); d=scicore.array(c);print time()-t1
>0.000162839889526 # scipy_core speed seems normal
>
>In [266]: t1=time(); d=numarray.strings.array(c);print time()-t1
>1.38820910454 # converting numarray strings into themselves is
> # the slowest!
>
>Using numarray 1.5.0 and scipy_core 0.9.2.1763.
>
>Cheers
>
>
I logged this on SourceForge with the growing collection of
numarray.strings issues. For now, strings.array() isn't taking
advantage of the new array protocol and is implemented largely in Python.
Todd
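
For anyone who wants to repeat the measurement, here is a minimal sketch of
the same timing comparison. It assumes present-day NumPy as a stand-in for
scipy_core and numarray, so it will not reproduce the numarray.strings
slowdown itself; the timed() helper is a hypothetical name introduced only
for this illustration.

```python
import time

import numpy as np


def timed(func, *args, **kwargs):
    """Call func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# 15-dimensional array holding a single element, as in the report above.
a = np.reshape(np.array((1,)), (1,) * 15)

# Numeric conversion (assumption: np.array stands in for numarray.array).
c, t_num = timed(np.array, a)

# String conversion: same data as 1-character byte strings.
b = np.array(a, dtype="S1")
d, t_str = timed(np.array, b)

print("numeric conversion: %.6f s" % t_num)
print("string conversion:  %.6f s" % t_str)
```

If the two timings are of the same order of magnitude, the string path is
not paying a per-dimension penalty in Python code, which is the behaviour the
report was hoping for.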