[Numpy-discussion] In-place fancy selection
Francesc Altet
faltet@carabos....
Thu Mar 1 14:44:19 CST 2007
El dj 01 de 03 del 2007 a les 13:26 -0700, en/na Charles R Harris va
escriure:
>
>
> On 3/1/07, Francesc Altet <faltet@carabos.com> wrote:
> Hi,
>
> I don't think there is a solution for this, but perhaps
> anybody may
> offer some idea. Given:
>
> In [79]:a=numpy.arange(9,-1,-1)
> In [80]:b=numpy.arange(10)
> In [81]:numpy.random.shuffle(b)
> In [82]:b
> Out[82]:array([2, 6, 3, 5, 4, 9, 0, 8, 7, 1])
> In [83]:a=a[b]
> In [84]:a
> Out[84]:array([7, 3, 6, 4, 5, 0, 9, 1, 2, 8])
>
> is there a way to make the step 83 without having to keep 3
> arrays
> in-memory at the same time? This is, some way of doing fancy
> indexing,
> but changing the elements *inplace*. The idea is to keep
> memory
> requeriments as low as possible when a and b are large arrays.
>
> Thanks!
>
> I think that would be tough because of overlap between the two sides.
> The permutation could be factored into cycles which would mostly avoid
> that, but that is more theoretical than practical here. What is it you
> are trying to do?
Yeah, the problem is the overlap. Well, what I'm trying to do is, given
two arrays on-disk (say, block and block_idx), sort one of them, and
then, re-order the other following with the same order than the first
one. My best approach until now is:
block = tmp_sorted[nslice] # read block from disk
sblock_idx = block.argsort()
block.sort()
# do things with block...
del block # get rid of block
block_idx = tmp_indices[nslice] # read bock_idx from disk
indices[nslice] = block_idx[sblock_idx]
but the last line will take 3 times the memory that takes block_idx
alone. My goal would be that the algorithm above would take only 2 times
the memory of block_idx, but I don't think this is going to be possible.
Thanks!
--
Francesc Altet | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com | I haven't tested it. -- Donald Knuth
More information about the Numpy-discussion
mailing list