[Numpy-discussion] trying to speed up the following....
Brennan Williams
brennan.williams@visualreservoir....
Wed Mar 25 00:09:09 CDT 2009
Robert Kern wrote:
> On Tue, Mar 24, 2009 at 18:29, Brennan Williams
> <brennan.williams@visualreservoir.com> wrote:
>
>> I have an array (porvatt.yarray) of ni*nj*nk values.
>> I want to create two further arrays.
>>
>> activeatt.yarray is of size ni*nj*nk and is a pointer array to an active
>> cell number. If a cell is inactive then its activeatt.yarray value will be 0
>>
>> ijkatt.yarray is of size nactive, the number of active cells (which I
>> already know). ijkatt.yarray holds the ijk cell number for each active cell.
>>
>>
>> My code looks something like...
>>
>> activeatt.yarray=zeros(ncells,dtype=int)
>> ijkatt.yarray=zeros(nactivecells,dtype=int)
>>
>> iactive=-1
>> ni=currentgrid.ni
>> nj=currentgrid.nj
>> nk=currentgrid.nk
>> for ijk in range(0,ni*nj*nk):
>>     if porvatt.yarray[ijk]>0:
>>         iactive+=1
>>         activeatt.yarray[ijk]=iactive
>>         ijkatt.yarray[iactive]=ijk
>>
>> I may often have a million+ cells.
>> So the code above is slow.
>> How can I speed it up?
>>
>
> mask = (porvatt.yarray.flat > 0)
> ijkatt.yarray = np.nonzero(mask)
>
> # This is not what your code does, but what I think you want.
> # Where porvatt.yarray is inactive, activeatt.yarray is -1.
> # 0 might be an active cell.
> activeatt.yarray = np.empty(ncells, dtype=int)
> activeatt.yarray.fill(-1)
> activeatt.yarray[mask] = ijkatt.yarray
>
>
>
Thanks. Concise & fast. This is what I've got so far (minor mods from
the above):

from numpy import *
...
mask=porvatt.yarray>0.0
ijkatt.yarray=nonzero(mask)[0]
activeindices=arange(0,ijkatt.yarray.size)
activeatt.yarray = empty(ncells, dtype=int)
activeatt.yarray.fill(-1)
activeatt.yarray[mask] = activeindices

I have

ijkatt.yarray=nonzero(mask)[0]

because nonzero returns a tuple of arrays (one per dimension) rather than
a single array.

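A minimal sketch of the tuple behaviour described above, using a small made-up mask (the values are illustrative, not taken from the real grid). It also shows np.flatnonzero, which is one alternative that returns the index array directly for a 1-D input:

```python
import numpy as np

# Hypothetical 5-cell mask; True marks an active cell.
mask = np.array([False, True, True, False, True])

# np.nonzero returns a tuple holding one index array per dimension.
result = np.nonzero(mask)

# For a 1-D mask the tuple has a single entry, hence the [0].
indices = result[0]

# np.flatnonzero gives the same indices as a plain array.
flat_indices = np.flatnonzero(mask)
```

Either form should work for the 1-D case here; the [0] version is the one used in this thread.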
I used

activeindices=arange(0,ijkatt.yarray.size)

and

activeatt.yarray[mask] = activeindices

as I have 686000 cells, of which 129881 are 'active', so my
activeatt.yarray values range from -1 for inactive cells, through 0 for
the first active cell, up to 129880 for the last active cell.

About to test it out by replacing my old for loop. Looks like it will be
about 20x faster for 1 million cells.
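For checking the replacement against the old loop, something like the following sketch could be used. It assumes a plain 1-D NumPy array in place of porvatt.yarray; the function names build_active_maps and build_active_maps_loop and the sample values are hypothetical, and the loop version here writes -1 (not 0) for inactive cells so the two results are comparable:

```python
import numpy as np

def build_active_maps(porv):
    """Vectorized version: return (activeatt, ijkatt) for a 1-D array."""
    mask = porv > 0.0
    ijkatt = np.flatnonzero(mask)                  # cell number of each active cell
    activeatt = np.full(porv.size, -1, dtype=int)  # -1 marks inactive cells
    activeatt[mask] = np.arange(ijkatt.size)       # 0..nactive-1 for active cells
    return activeatt, ijkatt

def build_active_maps_loop(porv):
    """Loop version, kept only as a reference for comparison."""
    nactive = int(np.count_nonzero(porv > 0.0))
    activeatt = np.full(porv.size, -1, dtype=int)
    ijkatt = np.zeros(nactive, dtype=int)
    iactive = -1
    for ijk in range(porv.size):
        if porv[ijk] > 0:
            iactive += 1
            activeatt[ijk] = iactive
            ijkatt[iactive] = ijk
    return activeatt, ijkatt

# Made-up pore volumes for six cells; three of them are active.
porv = np.array([0.0, 1.5, 0.0, 2.0, 0.3, 0.0])
activeatt, ijkatt = build_active_maps(porv)
loop_activeatt, loop_ijkatt = build_active_maps_loop(porv)
```

Comparing the two results with np.array_equal on a small array like this is a cheap sanity check before timing the vectorized version on the full grid.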
Brennan