[Numpy-discussion] In-place fancy selection

Francesc Altet faltet@carabos....
Thu Mar 1 14:59:58 CST 2007


On Thu, 1 Mar 2007 at 13:40 -0700, Charles R Harris wrote:
> 
> 
> On 3/1/07, Francesc Altet <faltet@carabos.com> wrote:
>         Hi,
>         
>         I don't think there is a solution for this, but perhaps
>         somebody can offer an idea. Given:
>         
>         In [79]:a=numpy.arange(9,-1,-1)
>         In [80]:b=numpy.arange(10)
>         In [81]:numpy.random.shuffle(b)
>         In [82]:b 
>         Out[82]:array([2, 6, 3, 5, 4, 9, 0, 8, 7, 1])
>         In [83]:a=a[b]
>         In [84]:a
>         Out[84]:array([7, 3, 6, 4, 5, 0, 9, 1, 2, 8])
>         
>         Is there a way to perform step 83 without having to keep 3
>         arrays in memory at the same time? That is, some way of doing
>         fancy indexing, but changing the elements *in place*. The idea
>         is to keep memory requirements as low as possible when a and b
>         are large arrays.
>         
>         Thanks!
> 
> You can also put the arrays together and implement it as an inplace
> sort, which will save space at the price of n*log(n) operations. The
> idea is to sort on the shuffled array while carrying the corresponding
> elements of the other array along in the exchanges, which I think you
> can now do using fields and the order keyword in the sort. 

Nice idea! I think your approach is going to work:

In [18]:a=numpy.arange(9.,-1,-1)
In [19]:b=numpy.arange(10)
In [20]:numpy.random.shuffle(b)
In [21]:c=numpy.rec.fromarrays([a,b], dtype='i4,i4')
In [22]:c
Out[22]:
recarray([(9, 1), (8, 6), (7, 9), (6, 5), (5, 3), (4, 4), (3, 7), (2, 8),
       (1, 0), (0, 2)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
In [23]:c.sort(order='f0')
In [24]:c
Out[24]:
recarray([(0, 2), (1, 0), (2, 8), (3, 7), (4, 4), (5, 3), (6, 5), (7, 9),
       (8, 6), (9, 1)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
In [25]:c['f0']
Out[25]:array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [26]:c['f1']
Out[26]:array([2, 0, 8, 7, 4, 3, 5, 9, 6, 1])
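
A minimal sketch of the sort-based permutation, assuming the goal is still
the original a=a[b]. Sorting on 'f0' as in the session above orders the
records by the values of a, so to permute a the sort key has to be the field
that holds the permutation. Sorting on b itself sends a[i] to position b[i]
(a scatter), while sorting on the inverse permutation numpy.argsort(b)
reproduces a[b]. The helper permute_by_sort and the field names 'val' and
'key' are made up for illustration, and computing argsort(b) allocates one
more index array, so whether this actually saves memory would still need to
be measured.

```python
import numpy as np

def permute_by_sort(values, keys):
    """Pack (value, key) pairs into one structured array, sort it in place
    on the key field, and return the reordered values column (a view)."""
    pairs = np.empty(values.size,
                     dtype=[("val", values.dtype), ("key", keys.dtype)])
    pairs["val"] = values
    pairs["key"] = keys
    pairs.sort(order="key")  # in-place sort; each value travels with its key
    return pairs["val"]

a = np.arange(9, -1, -1)
b = np.array([1, 6, 9, 5, 3, 4, 7, 8, 0, 2])  # the shuffled b from the session

# Key = b: the element a[i] ends up at position b[i].
scattered = permute_by_sort(a, b)

# Key = inverse permutation of b: the result matches fancy indexing a[b].
gathered = permute_by_sort(a, np.argsort(b))
```

With these inputs, gathered equals a[b], and scattered[b] gives back the
original a.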

Tomorrow I'll run some timings and check that memory consumption is
lower than with my current approach, but my gut feeling is good.

Many thanks!

-- 
Francesc Altet    |  Be careful about using the following code --
Carabos Coop. V.  |  I've only proven that it works, 
www.carabos.com   |  I haven't tested it. -- Donald Knuth


