[SciPy-user] shared memory machines

Philip Semanchuk philip@semanchuk....
Mon Feb 9 09:28:44 CST 2009

On Feb 9, 2009, at 3:23 AM, Gael Varoquaux wrote:

> On Mon, Feb 09, 2009 at 07:15:11AM +0100, Gael Varoquaux wrote:
>> It really was. Thanks a lot. I need to do a few more checks, but I
>> believe I have a first version of some code sharing arrays by name.
> OK, I have a first working version under Unix (attached, with trivial
> test case).
> Now we need to make it so that the ndarray can be used in the
> multiprocessing function call, rather than the buffer object. In other
> words, we need to create an object that behaves as an ndarray but
> implements a different pickling method. What do people suggest as the
> best approach here? Subclassing ndarray?

I notice that the size of the shared memory segment is set to "pages"  
* PAGESIZE. Who determines the value of "pages"? And what happens if  
the numpy object you're storing in the segment grows beyond that size?  
AFAIK ftruncate() can only be called *once* to resize the segment.
That's true on OS X, anyway; other platforms may be more permissive.
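
To make the sizing question concrete, here is a minimal sketch of rounding a byte count up to whole pages before allocating. It uses an anonymous mmap as a stand-in for the real shared memory segment; pages_needed() and allocate_segment() are hypothetical names, not taken from the attached code.

```python
import mmap


def pages_needed(nbytes, pagesize=mmap.PAGESIZE):
    """Return the smallest number of whole pages that can hold nbytes."""
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    # Ceiling division; always reserve at least one page.
    return max(1, -(-nbytes // pagesize))


def allocate_segment(nbytes):
    """Allocate an anonymous mapping sized to pages * PAGESIZE.

    This stands in for a named shared memory segment; the size is
    fixed at creation, so the caller must know nbytes up front.
    """
    pages = pages_needed(nbytes)
    return mmap.mmap(-1, pages * mmap.PAGESIZE)
```

If the stored object later needs more than pages * PAGESIZE bytes, this sketch offers no way to grow the mapping in place; a new, larger segment would have to be created.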

I once wrote some code to implement a shared dict using shared memory,
and this was a problem I ran into: what happens when an item grows?
The solution I eventually developed was to have one shared memory
segment for metadata and a collection of other shared memory segments
to hold the actual data. The metadata segment stored a (pickled) free
space map. If a request was made to store an item larger than any
free slot I had, I'd allocate a new segment of the appropriate size.
Otherwise, I'd stick it in the smallest piece of free space that it
would fit into in an existing segment.
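
The allocation policy described above (smallest free slot that fits, else a new segment) can be sketched roughly as follows. This is only an illustration under assumptions: the FreeSpaceMap class, its tuple layout, and its method names are hypothetical, and it tracks bookkeeping only, without touching real shared memory.

```python
class FreeSpaceMap:
    """Tracks free slots as (segment_id, offset, size) tuples."""

    def __init__(self):
        self.free = []          # free slots across all data segments
        self.next_segment = 0   # id to give the next new segment

    def allocate(self, size):
        """Return (segment_id, offset, size) for a new item."""
        fits = [slot for slot in self.free if slot[2] >= size]
        if not fits:
            # Nothing is big enough: pretend to create a new segment
            # of exactly the requested size.
            seg = self.next_segment
            self.next_segment += 1
            return (seg, 0, size)
        # Best fit: the smallest free slot that can hold the item.
        slot = min(fits, key=lambda s: s[2])
        self.free.remove(slot)
        seg, offset, slot_size = slot
        if slot_size > size:
            # Keep the unused tail of the slot as free space.
            self.free.append((seg, offset + size, slot_size - size))
        return (seg, offset, size)

    def release(self, seg, offset, size):
        """Mark a previously allocated region as free again."""
        self.free.append((seg, offset, size))
```

Since the map is a plain list of tuples, it could be pickled and written into the metadata segment, which is the arrangement the paragraph above describes.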

You can perhaps see where this is leading -- once one is tracking free
space slots and so forth, one needs to think about memory compaction,
too, because sooner or later items will get deleted from the dict, and
if nothing new is inserted, all of that free space sits around going
to waste.
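
One simple form of compaction is to slide the live items in a segment toward offset 0 so that all free space ends up in a single block at the end. The sketch below only plans the moves; compact() and its data layout are hypothetical, and a real implementation would also have to copy the bytes and coordinate with other processes.

```python
def compact(items):
    """Plan a compaction of one segment.

    items maps key -> (offset, size) for the live items in a segment.
    Returns (new_items, moves, used), where moves is a list of
    (old_offset, new_offset, size) copies to perform in order and
    used is the number of bytes occupied after compaction.
    """
    new_items = {}
    moves = []
    cursor = 0
    # Walk items in ascending offset order so each one only ever
    # moves toward lower addresses, never over an item not yet moved.
    for key, (offset, size) in sorted(items.items(), key=lambda kv: kv[1][0]):
        if offset != cursor:
            moves.append((offset, cursor, size))
        new_items[key] = (cursor, size)
        cursor += size
    return new_items, moves, cursor
```

Everything from the returned used offset to the end of the segment is then one contiguous free slot.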

Also, is it consistent with your license to use code from Python  
itself? If so, then I have another minor suggestion.

