[Numpy-discussion] custom allocation of numpy array

Robert Kern robert.kern@gmail....
Tue Jul 7 19:54:40 CDT 2009


On Tue, Jul 7, 2009 at 17:50, Trevor Clarke <trevor@notcows.com> wrote:
> I'm embedding python and numpy in a C++ program and I want to share some
> data owned by C++. I'm able to get the allocation/deallocation and memory
> sharing working for a contiguous array. However, I have non-contiguous data
> (random sub-array locations, not a fixed skip factor) and I may not have all
> of the data in memory at any given time. The data is quite large (200 GB is
> not uncommon) so I need to load and access on-demand. I've already got a
> paging system in place which obtains a contiguous sub-array on-demand and
> I'd like to use this with numpy. I've looked at the memmap array code and
> that's not sufficient as the files are compressed so a simple memmap won't
> work.
>
> My initial thought is to tell numpy that the entire array is available and
> intercept the access requests and load as necessary.

This is not possible without special support from the virtual memory
system, such as mmap. numpy's memory model requires that the data be
described by a single valid pointer to the starting location, together
with the shape and strides. This is a very regular structure. You
cannot have multiple, arbitrarily-distributed "starting pointers".
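
To make the pointer/shape/strides description concrete, here is a
small sketch. The array contents and the helper name element_offset
are made up for illustration; only the numpy attributes are real.
It shows that every element of an array or view sits at
start + sum(index[k] * strides[k]), and that an arbitrary selection
of rows and columns cannot be expressed that way, so numpy copies it.

```python
import numpy as np

# A small C-contiguous array: 3 rows of 4 int32 values (4 bytes each).
a = np.arange(12, dtype=np.int32).reshape(3, 4)
base = a.__array_interface__["data"][0]  # address of the first element

def element_offset(arr, index):
    """Byte offset of arr[index] from the array's starting pointer.

    Illustrative helper (not a numpy function).
    """
    return sum(i * s for i, s in zip(index, arr.strides))

# A regular slice is still "one pointer + shape + strides": it is a view.
v = a[::2, 1::2]
v_base = v.__array_interface__["data"][0]

# An arbitrary selection has no single set of strides, so numpy
# has to materialise it as a new, separate buffer.
w = a[[0, 2], [1, 3]]

print("a:", a.shape, a.strides)
print("v:", v.shape, v.strides, "start offset:", v_base - base)
print("v shares memory with a:", np.shares_memory(a, v))
print("w shares memory with a:", np.shares_memory(a, w))
```

The view v only changes the starting pointer and the strides; the
fancy-indexed w is a copy, which is the case that does not fit a
single regular description.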

You will have to deal with the sub-arrays as sub-arrays.
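
One way this could look from the Python side is sketched below. This
is only an illustration under assumptions: BlockCache, demo_loader
and the fixed row-block layout are hypothetical names and choices,
and demo_loader merely stands in for whatever the existing C++ paging
system would expose. Each piece handed out is an ordinary contiguous
ndarray, and the caller iterates over pieces instead of indexing one
giant array.

```python
import numpy as np
from collections import OrderedDict

class BlockCache:
    """Hypothetical wrapper: serves contiguous row blocks as ndarrays."""

    def __init__(self, load_block, n_rows, block_rows, max_blocks=4):
        self.load_block = load_block    # callable: block index -> 2-D ndarray
        self.n_rows = n_rows
        self.block_rows = block_rows
        self.max_blocks = max_blocks    # number of blocks kept in memory
        self._cache = OrderedDict()     # block index -> ndarray, in LRU order

    def block(self, b):
        """Return block b, loading it on demand and evicting the oldest."""
        if b in self._cache:
            self._cache.move_to_end(b)
        else:
            self._cache[b] = self.load_block(b)
            if len(self._cache) > self.max_blocks:
                self._cache.popitem(last=False)
        return self._cache[b]

    def rows(self, start, stop):
        """Yield (first_row, ndarray) pieces covering rows [start, stop)."""
        stop = min(stop, self.n_rows)
        row = start
        while row < stop:
            b, lo = divmod(row, self.block_rows)
            hi = min(self.block_rows, lo + (stop - row))
            yield row, self.block(b)[lo:hi]
            row += hi - lo

# Stand-in for the real paging system: fabricates a 4x3 block of values.
loads = []

def demo_loader(b):
    loads.append(b)
    return np.arange(b * 12, (b + 1) * 12).reshape(4, 3)

cache = BlockCache(demo_loader, n_rows=12, block_rows=4, max_blocks=2)
total = sum(piece.sum() for _, piece in cache.rows(3, 9))
print("blocks loaded:", loads, "sum over rows 3..8:", total)
```

Reductions and element-wise operations can usually be written
piece by piece in this style, so only max_blocks blocks need to be
resident at once.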

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

