[Numpy-discussion] Make array uncopyable

Dag Sverre Seljebotn d.s.seljebotn@astro.uio...
Wed Mar 23 15:36:11 CDT 2011


On 03/23/2011 09:24 PM, Daniel Lepage wrote:
> On Wed, Mar 23, 2011 at 3:54 PM, Dag Sverre Seljebotn
> <d.s.seljebotn@astro.uio.no>  wrote:
>> On 03/23/2011 08:05 PM, Daniel Lepage wrote:
>>> Hi all,
>>>      Is there a way to mark an array as uncopyable? If you have very
>>> large data arrays some numpy functions will cause MemoryErrors because
>>> behind the scenes they e.g. transpose the arrays into Fortran order to
>>> feed them into Fortran libraries (numpy.linalg.svd is an example of
>>> this). It would be great if there were a way to mark an array as "too
>>> big to copy" so that an attempt to call transpose() or astype() would
>>> raise an exception immediately instead of clobbering the machine's
>>> memory first, but I don't know of any flag that does this.
>>>
>>> I suppose I could always subclass ndarray and redefine transpose(),
>>> astype(), etc., but it'd be nice if there were an easier way.
>> This is a bit OT, but if your problem is simply wanting to fail hard
>> when using too much memory instead of swapping to disk, then on Unix
>> systems you can use "ulimit -v" to limit how much memory your
>> application can use. When I do this the MemoryError is quick and painless.
> Ah, I'd completely forgotten that I could do that. That'll help a lot, thanks!
>
> That said, it'd still be cool if there were a way to have numpy warn
> me of these things, so that I can tell the difference between "I ran out
> of memory because I did something stupid" and "I ran out of memory
> because numpy was doubling things behind the scenes".
>
> What would be *really* cool is if I could get error messages telling
> me why something wanted to copy my array (e.g.
> UncopyableException("This function requires Fortran-style arrays")) so
> that I'd know what I'd need to do to my arrays if I wanted the
> function to work, but I'm sure applying that to all of numpy would be
> a gargantuan undertaking.

Well, most such copies happen in the same NumPy C API function, so a
global flag there would go a long way (but perhaps it should disable
explicit copying as well...). But I don't think any such flag is
implemented. And yes, doing it in any nice way would likely be a major
effort.

For a quicker workaround that doesn't touch NumPy, you could also 
implement a context manager in Cython that temporarily sets the ulimit 
("man setrlimit", and use the soft limits), and wrap your calls to NumPy 
in that.
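
A minimal sketch of that idea, assuming plain Python and the standard
library "resource" module instead of Cython (Unix only). The name
address_space_limit is made up for illustration, and RLIMIT_AS is assumed
to be the limit that "ulimit -v" controls on your platform:

```python
import contextlib
import resource


@contextlib.contextmanager
def address_space_limit(max_bytes):
    """Temporarily lower the soft RLIMIT_AS limit (hypothetical helper)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # A soft limit may not exceed a finite hard limit, so clamp the request.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    # Only the soft limit is changed; the hard limit is passed through as-is.
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
    try:
        yield
    finally:
        # Restore the previous soft limit, even if the body raised.
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
```

You would then wrap the suspect call, e.g. "with
address_space_limit(4 * 1024**3): numpy.linalg.svd(a)". Note that this
caps the total address space of the process, not the size of any single
array, so the limit has to leave room for everything already allocated.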

Dag Sverre

