[Numpy-discussion] trying to speed up the following....

Brennan Williams brennan.williams@visualreservoir....
Wed Mar 25 00:09:09 CDT 2009


Robert Kern wrote:
> On Tue, Mar 24, 2009 at 18:29, Brennan Williams
> <brennan.williams@visualreservoir.com> wrote:
>   
>> I have an array (porvatt.yarray) of ni*nj*nk values.
>> I want to create two further arrays.
>>
>> activeatt.yarray is of size ni*nj*nk and is a pointer array to an active
>> cell number. If a cell is inactive then its activeatt.yarray value will be 0
>>
>> ijkatt.yarray is of size nactive, the number of active cells (which I
>> already know). ijkatt.yarray holds the ijk cell number for each active cell.
>>
>>
>> My code looks something like...
>>
>>           activeatt.yarray=zeros(ncells,dtype=int)
>>           ijkatt.yarray=zeros(nactivecells,dtype=int)
>>
>>            iactive=-1
>>            ni=currentgrid.ni
>>            nj=currentgrid.nj
>>            nk=currentgrid.nk
>>            for ijk in range(0,ni*nj*nk):
>>              if porvatt.yarray[ijk]>0:
>>                iactive+=1
>>                activeatt.yarray[ijk]=iactive
>>                ijkatt.yarray[iactive]=ijk
>>
>> I may often have a million+ cells.
>> So the code above is slow.
>> How can I speed it up?
>>     
>
> mask = (porvatt.yarray.flat > 0)
> ijkatt.yarray = np.nonzero(mask)
>
> # This is not what your code does, but what I think you want.
> # Where porvatt.yarray is inactive, activeatt.yarray is -1.
> # 0 might be an active cell.
> activeatt.yarray = np.empty(ncells, dtype=int)
> activeatt.yarray.fill(-1)
> activeatt.yarray[mask] = ijkatt.yarray
>
>
>   
Thanks. Concise and fast. This is what I've got so far (minor mods from
the above):

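from numpy import *
...
# True where a cell is active (porvatt value > 0)
mask=porvatt.yarray>0.0
# ijk cell number of each active cell; nonzero returns a tuple, hence [0]
ijkatt.yarray=nonzero(mask)[0]
# active cell numbers 0..nactive-1
activeindices=arange(0,ijkatt.yarray.size)
# map ijk -> active cell number, with -1 marking inactive cells
activeatt.yarray = empty(ncells, dtype=int)
activeatt.yarray.fill(-1)
activeatt.yarray[mask] = activeindices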

I have...

ijkatt.yarray=nonzero(mask)[0]

because nonzero returns a tuple of arrays (one per dimension) rather than a
single array.
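
A quick way to see this, as a minimal sketch with a made-up five-element
mask (not my real data). numpy.flatnonzero is shown as an alternative
that returns the index array directly; I haven't switched to it above.

```python
import numpy as np

# Made-up mask for illustration only.
mask = np.array([False, True, True, False, True])

result = np.nonzero(mask)            # tuple with one index array per dimension
indices = result[0]                  # the 1-D index array itself
flat_indices = np.flatnonzero(mask)  # same indices, returned as a plain array

print(type(result), len(result))
print(indices, flat_indices)
```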

I used

activeindices=arange(0,ijkatt.yarray.size)

and

activeatt.yarray[mask] = activeindices

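because activeatt.yarray should hold the active cell number rather than the
ijk cell number. I have 686000 cells, of which 129881 are 'active', so my
activeatt.yarray values range from -1 for inactive cells, through 0 for the
first active cell, up to 129880 for the last active cell.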
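
To make the two mappings concrete, here is a toy sketch on six cells.
The values in porv and the names ijk_of_active / active_of_ijk are made
up for illustration; they stand in for porvatt.yarray, ijkatt.yarray and
activeatt.yarray.

```python
import numpy as np

# Made-up values: cells 1, 3 and 4 are active (value > 0).
porv = np.array([0.0, 2.5, 0.0, 1.2, 3.1, 0.0])
ncells = porv.size

mask = porv > 0.0
ijk_of_active = np.nonzero(mask)[0]             # active number -> ijk
active_of_ijk = np.full(ncells, -1, dtype=int)  # ijk -> active number, -1 = inactive
active_of_ijk[mask] = np.arange(ijk_of_active.size)

print(ijk_of_active)   # [1 3 4]
print(active_of_ijk)   # [-1  0 -1  1  2 -1]

# Round trip: looking up each active cell's ijk gives back its active number.
round_trip = active_of_ijk[ijk_of_active]
```

The round trip at the end is a cheap sanity check that the two arrays
are inverses of each other on the active cells.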

About to test it out by replacing my old for loop. Looks like it will be 
about 20x faster for 1m cells.
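
One way to check that the replacement matches the old loop, and to time
both, is sketched below. It runs on random made-up data rather than my
grid, the function names are hypothetical, and the loop version here
marks inactive cells with -1 (not 0 as in my original code) so the two
results are directly comparable. Timings will vary by machine.

```python
import time
import numpy as np

def loop_version(porv):
    """Element-by-element loop; inactive cells are marked -1."""
    active_of_ijk = np.full(porv.size, -1, dtype=int)
    ijk_of_active = []
    iactive = -1
    for ijk in range(porv.size):
        if porv[ijk] > 0:
            iactive += 1
            active_of_ijk[ijk] = iactive
            ijk_of_active.append(ijk)
    return active_of_ijk, np.array(ijk_of_active, dtype=int)

def vectorized_version(porv):
    """Same result using a boolean mask instead of a Python loop."""
    mask = porv > 0.0
    ijk_of_active = np.nonzero(mask)[0]
    active_of_ijk = np.full(porv.size, -1, dtype=int)
    active_of_ijk[mask] = np.arange(ijk_of_active.size)
    return active_of_ijk, ijk_of_active

# Made-up test data: random values with roughly 80% of cells zeroed out.
rng = np.random.default_rng(0)
porv = rng.random(200_000)
porv[rng.random(porv.size) < 0.8] = 0.0

t0 = time.perf_counter()
loop_result = loop_version(porv)
t1 = time.perf_counter()
vec_result = vectorized_version(porv)
t2 = time.perf_counter()

same = all(np.array_equal(a, b) for a, b in zip(loop_result, vec_result))
print("results agree:", same)
print("loop: %.4f s, vectorized: %.4f s" % (t1 - t0, t2 - t1))
```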


Brennan


