[Numpy-discussion] Slow performance in array protocol with string arrays

Todd Miller jmiller at stsci.edu
Wed Jan 4 04:23:18 CST 2006


Francesc Altet wrote:

>Hi,
>
>Perhaps this is not very important because it only matters at high
>dimensionality, but I think it would be good to send it here for the
>record.
>
>It seems that the numarray implementation of the array protocol for
>string arrays is very slow for dimensionality > 10:
>
>In [258]: a=scicore.reshape(scicore.array((1,)), (1,)*15)
>
>In [259]: a
>Out[259]: array([[[[[[[[[[[[[[[1]]]]]]]]]]]]]]])
>
>In [260]: t1=time(); c=numarray.array(a);print time()-t1
>0.000355958938599   # numerical conversion is pretty fast: 0.3 ms
>
>In [261]: b=scicore.array(a, dtype="S1")
>
>In [262]: b
>Out[262]: array([[[[[[[[[[[[[[[1]]]]]]]]]]]]]]], dtype=(string,1))
>
>In [263]: t1=time(); c=numarray.strings.array(b);print time()-t1
>0.61981511116      # string conversion is more than 1000x slower
>
>In [264]: t1=time(); d=scicore.array(c);print time()-t1
>0.000162839889526  # scipy_core speed seems normal
>
>In [266]: t1=time(); d=numarray.strings.array(c);print time()-t1
>1.38820910454      # converting numarray strings into themselves is
>                   # the slowest!
>
>Using numarray 1.5.0 and scipy_core 0.9.2.1763.
>
>Cheers
>  
>
I logged this on SourceForge with the growing collection of
numarray.strings issues. For now, strings.array() isn't taking
advantage of the new array protocol and is implemented largely in Python.
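
To illustrate what taking advantage of the protocol could look like, here
is a minimal sketch. It is not numarray code. It assumes present-day NumPy
and its `__array_interface__` attribute, which may differ in detail from the
protocol as it stood in 2006, and the `StringGrid` class is a hypothetical
exporter invented for this example. The point is only that a consumer can
read the shape, type string, and data buffer directly, so the cost does not
grow with the nesting depth the way an element-by-element Python loop does.

```python
import numpy as np


class StringGrid:
    """Hypothetical exporter: a flat bytes buffer plus shape metadata."""

    def __init__(self, raw, shape, itemsize):
        self._raw = bytes(raw)        # contiguous storage for all items
        self._shape = tuple(shape)    # logical N-dimensional shape
        self._itemsize = itemsize     # bytes per fixed-width string

    @property
    def __array_interface__(self):
        # NumPy reads this dict instead of iterating over nested sequences.
        return {
            "version": 3,
            "shape": self._shape,
            "typestr": "|S%d" % self._itemsize,  # fixed-width byte string
            "data": self._raw,                   # any buffer-exposing object
        }


# A 15-dimensional, single-element string array, as in the report above.
deep = np.asarray(StringGrid(b"1", (1,) * 15, 1))
print(deep.ndim, deep.dtype)
```

Because the data is handed over as a buffer, the resulting array is
read-only; a consumer that needs to modify it would take a copy.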

Todd



