[Numpy-discussion] question about index array behavior
Perry Greenfield
perry at stsci.edu
Fri Jan 13 11:39:01 CST 2006
On Jan 13, 2006, at 2:07 PM, Russel Howe wrote:
> In the session below, I expected the for loop and the index array to
> have the same behavior. Is this behavior by design? Is there some
> other way to get the behavior of the for loop? The loop is too slow
> for my application ( len(ar1) == 18000).
> Russel
Using index arrays this way is always going to be a bit confusing, and
this is a common example of that. Any time you use repeated indices in
an index assignment, you will not get what you would naively expect.
It's useful to look at what is going on in a little more detail. The
expression ar2[ind] first selects elements of ar2, producing a
10-element array, which is then added to the random elements. Initially
that 10-element array is all zeros, so after the addition it equals the
random array. Only then does the index assignment take place. First,
the first element of the summed array is assigned to ar2[0], then the
second element of the summed array is also assigned to ar2[0],
overwriting the first, and that is the problem. The summing is done
before the assignment, so nothing accumulates. Generally the value for
the last occurrence of a repeated index is what ends up as the final
value.
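To make the two steps concrete, here is a minimal sketch written against present-day NumPy rather than the numarray session quoted below (an assumption on my part; the fixed values 1 through 10 stand in for the random array so the result is easy to check by hand). It spells out the gather-add and the scatter separately and compares them with the explicit loop.

```python
import numpy as np

# Stand-in for the random values: 1.0, 2.0, ..., 10.0 (illustrative only).
ar1 = np.arange(1.0, 11.0)
ind = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])

# What "ar2[ind] += ar1" amounts to, written as two explicit steps.
two_step = np.zeros(5)
tmp = two_step[ind] + ar1   # gather ten zeros, then add: tmp equals ar1
two_step[ind] = tmp         # scatter: duplicates overwrite, they do not add

# The in-place form gives the same result as the two steps above.
in_place = np.zeros(5)
in_place[ind] += ar1

# The explicit loop accumulates, because each += sees the updated element.
looped = np.zeros(5)
for i in range(len(ind)):
    looped[ind[i]] += ar1[i]

print(in_place)
print(looped)
```

With these values the loop produces the pairwise sums 3, 7, 11, 15, 19, while the index-array form keeps only one value per repeated index.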
It is possible to do what you want without a for loop, but perhaps not
as fast as it would be in C. One way to do it is to sort the indices in
increasing order, reorder the values to match the sorted indices, and
then use accumulated sums to derive the sum corresponding to each
index. It's a bit complicated, but can be much faster than a for loop.
See example 3.7.4 in our tutorial for the details of how this is done:
http://www.scipy.org/wikis/topical_software/Tutorial
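The sort-and-accumulate idea can be sketched roughly as follows. This is my own reconstruction in present-day NumPy, not the tutorial's example 3.7.4, and the function name sum_by_index and its size parameter are hypothetical names introduced for illustration.

```python
import numpy as np

def sum_by_index(ind, values, size):
    """Sum the values that share an index, without a Python-level loop.

    Hypothetical helper: `size` is the length of the output array.
    """
    ind = np.asarray(ind)
    values = np.asarray(values)
    out = np.zeros(size, dtype=values.dtype)
    if ind.size == 0:
        return out

    # Sort the indices and reorder the values to match.
    order = np.argsort(ind, kind="stable")
    sorted_ind = ind[order]
    sorted_val = values[order]

    # Running sum over the reordered values.
    csum = np.cumsum(sorted_val)

    # Position of the last element in each run of equal indices.
    last = np.flatnonzero(np.r_[sorted_ind[1:] != sorted_ind[:-1], True])

    # Each run's total is the running sum at its end minus the running
    # sum at the end of the previous run (zero before the first run).
    totals = np.diff(np.r_[0.0, csum[last]])

    out[sorted_ind[last]] = totals
    return out

result = sum_by_index([2, 0, 2, 1, 0], [1.0, 2.0, 3.0, 4.0, 5.0], 4)
print(result)
```

One caveat with this approach: the per-index totals come from differences of a running sum, so for long floating-point arrays they can pick up a little more rounding error than summing each group directly.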
Maybe someone has a more elegant, faster, or cleverer way to do this
that I've overlooked. I've seen this come up often enough that it may
be useful to provide a special function to make this easier to do.
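For readers on present-day NumPy (not the numarray 1.5.0 used in the session below), two existing calls cover this case: np.add.at, which performs the addition unbuffered so repeated indices accumulate, and np.bincount with its weights argument. A short sketch, again using the fixed values 1 through 10 as a stand-in for the random array:

```python
import numpy as np

ar1 = np.arange(1.0, 11.0)   # illustrative stand-in for the random values
ind = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])

# Unbuffered in-place add: every occurrence of an index contributes.
ar2 = np.zeros(5)
np.add.at(ar2, ind, ar1)

# Alternative for non-negative integer indices: weighted bin counts.
sums = np.bincount(ind, weights=ar1, minlength=5)

print(ar2)
print(sums)
```

Both give the same totals as the for loop. np.bincount returns a new float64 array, whereas np.add.at updates an existing array in place.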
Perry Greenfield
> Python 2.4.2 (#1, Nov 29 2005, 08:43:33)
> [GCC 4.0.1 (Apple Computer, Inc. build 5247)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from numarray import *
> >>> import numarray.random_array as ra
> >>> print libnumarray.__version__
> 1.5.0
> >>> ar1=ra.random(10)
> >>> ar2=zeros(5, type=Float32)
> >>> ind=array([0,0,1,1,2,2,3,3,4,4])
> >>> ar2[ind]+=ar1
> >>> ar2
> array([ 0.09791247, 0.26159889, 0.89386773, 0.32572687,
> 0.86001897], type=Float32)
> >>> ar1
> array([ 0.49895534, 0.09791247, 0.424059 , 0.26159889, 0.29791802,
> 0.89386773, 0.44290054, 0.32572687, 0.53337622,
> 0.86001897])
> >>> ar2*=0.0
> >>> for x in xrange(len(ind)):
> ... ar2[ind[x]]+=ar1[x]
> ...
> >>> ar2
> array([ 0.5968678 , 0.68565786, 1.19178581, 0.76862741,
> 1.39339519], type=Float32)
> >>>