[SciPy-user] How to "move" data within an array

Andrew Smart subscriptions@smart-knowhow...
Wed Jul 4 12:14:27 CDT 2007


Hi Anne,

thanks for your very helpful input.

Currently I don't have any slowdowns, since I'm still working on the first
version of the software. I was just trying to prevent mistakes by design. I
already know that I will have very large arrays, and since I'm new to numpy I
thought I would "ask the pros". Better to ask now than to have to refactor the
software later because I took the wrong approach.

Numpy is definitely a very sophisticated package with lots of power, but
this also implies that there may be details that matter a lot for good
performance "hidden" beneath the surface.

I'll take the roll() approach and test it with lots of data. The target
platform is primarily Linux/Unix, but I'll have to run showcases on
Windows too... We'll see.

Thanks a lot,
Andrew


> -----Original Message-----
> From: scipy-user-bounces@scipy.org 
> [mailto:scipy-user-bounces@scipy.org] On Behalf Of Anne Archibald
> Sent: Wednesday, 4 July 2007 17:50
> To: SciPy Users List
> Subject: Re: [SciPy-user] How to "move" data within an array
> 
> On 04/07/07, Andrew Smart <subscriptions@smart-knowhow.de> wrote:
> 
> > I found roll() in the meantime also as one option. It's a feasible
> > approach, but still may cause memory fragmentation. I'll take roll()
> > for the time being - and some time later I'll see if someone makes me
> > a C based function which does the same without copying the array...
> 
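For reference, a minimal sketch of what "copying the array" means for roll(). The array contents are made up for illustration; np.shares_memory is used only to check that the result occupies a separate buffer.

```python
import numpy as np

# Hypothetical sample data standing in for a much larger array.
a = np.arange(6)

# np.roll returns a new array; the input is left untouched.
rolled = np.roll(a, -1)

# The result lives in its own buffer, so a full copy was made.
separate_buffers = not np.shares_memory(a, rolled)
```

Here rolled holds the elements shifted one slot to the left with the first element wrapped around to the end, while a keeps its original order.
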
> I don't think memory fragmentation should be a concern. For 
> one thing, as you've described the problem, the array size 
> isn't changing, so it will almost certainly be copied back 
> and forth between two memory blocks. For another, for an 
> array big enough for you to care about,
> malloc() will request a new hunk of memory from the OS, and 
> then free it back to the OS when you're done with it. (Well, 
> on Linux anyway.)
> 
> If you do want to move the array in place, you could try 
> A[:-1]=A[1:]. I don't know if that is smart enough to use 
> memmove() if the copy is of a contiguous block, but at the 
> least it will be done in place by C code. (The implementation 
> of that optimization is complicated by the semantics of that 
> operation - since source and destination overlap, the order 
> of the indexing matters.)
> 
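A minimal sketch of that in-place shift, assuming a 1-D array. The helper name shift_left_inplace and the new_value argument are hypothetical, added only for illustration. Note that, unlike roll(), this discards the first element instead of wrapping it around, so the last slot is free for new data.

```python
import numpy as np

def shift_left_inplace(a, new_value):
    """Drop a[0], move everything one slot left, store new_value at the end."""
    a[:-1] = a[1:]      # overlapping copy, performed inside NumPy's C code
    a[-1] = new_value   # the slot freed at the end takes the incoming value
    return a            # same object as the input, no new array allocated

# Hypothetical sample data.
buf = np.arange(6)
same = shift_left_inplace(buf, 99)
```

After the call, buf itself has been modified and same is the very same array object.
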
> A true in-place roll() is tricky and might have very poor 
> cache behaviour (because roll() tries to preserve all the 
> array elements, so rolling a 7-element array by three only has 
> to loop around once, skipping by three, while rolling a 
> 9-element array by three needs three separate loops, each 
> skipping by three).
> 
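To illustrate why the looping pattern depends on the array length, here is a rough pure-Python sketch of an in-place roll that follows cycles. It is not how numpy implements roll(); the function name roll_inplace is hypothetical. The number of separate loops needed is gcd(len(a), shift), and every step jumps shift elements ahead, which is what makes the memory access pattern unfriendly to the cache.

```python
import math
import numpy as np

def roll_inplace(a, shift):
    """Roll a 1-D array in place; return the number of cycles walked."""
    n = len(a)
    if n == 0:
        return 0
    shift %= n                      # also maps negative shifts into range
    if shift == 0:
        return 0
    cycles = math.gcd(n, shift)     # independent loops needed to touch all slots
    for start in range(cycles):
        carry = a[start]            # value waiting to be written one jump ahead
        idx = start
        while True:
            nxt = (idx + shift) % n
            # Store the carried value and pick up the one it displaces.
            a[nxt], carry = carry, a[nxt]
            idx = nxt
            if idx == start:
                break
    return cycles

a7 = np.arange(7)
cycles7 = roll_inplace(a7, 3)       # a single loop visits every element

a9 = np.arange(9)
cycles9 = roll_inplace(a9, 3)       # three loops of three elements each
```

Both results match what np.roll produces for the same input and shift; only the way the elements are visited differs.
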
> Memory allocation is cheap and efficient, so eliminating 
> temporaries is less helpful than it might seem. In 
> particular, it's worth knowing that numpy array memory is a 
> leaf with respect to the python garbage collector - it 
> doesn't need to be traversed, it just disappears when the 
> array objects pointing to it go. Also, as I said above, at 
> least on Linux, big monolithic allocations like numpy's go on 
> fresh pages from the OS, and are given back to it when done.
> 
> Is this actually the slowdown in your application?
> 
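One way to answer that question is to measure. The sketch below times roll() against the in-place slice assignment; the function name, array size and repeat count are arbitrary assumptions, and the numbers will depend on the machine and the numpy version.

```python
import timeit
import numpy as np

def time_shift_strategies(n=1_000_000, repeats=5):
    """Return the best-of-`repeats` time in seconds for (roll, in-place shift)."""
    a = np.zeros(n)

    def with_roll():
        return np.roll(a, -1)   # allocates and fills a new array on each call

    def in_place():
        a[:-1] = a[1:]          # reuses the existing buffer

    t_roll = min(timeit.repeat(with_roll, number=1, repeat=repeats))
    t_inplace = min(timeit.repeat(in_place, number=1, repeat=repeats))
    return t_roll, t_inplace

if __name__ == "__main__":
    t_roll, t_inplace = time_shift_strategies()
    print(f"roll: {t_roll:.6f} s   in-place: {t_inplace:.6f} s")
```

If both timings are small compared with the rest of the processing, the choice between the two approaches probably does not matter for the application.
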
> Anne
> _______________________________________________
> SciPy-user mailing list
> SciPy-user@scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
> 


