[SciPy-User] Multiprocessing and shared memory

Felix Schlesinger schlesin@cshl....
Sun Oct 18 20:11:50 CDT 2009


> Robin wrote:
>> I'm probably wrong - but if it is really only read only access you
>> need to an array can't you just put it as a module variable before the
>> fork - then all the workers can access it and as long as they don't
>> touch it it shouldn't make a copy.
>>
>>
> If you have a copy-on-write optimized fork (e.g. modern linux kernels),
> yes, pages that are never written to are never copied.

That does work if one is careful never to create any new reference to
the shared array or to modify it in any other implicit way in the
worker process. The problem is that a modification will not raise an
error; it will simply trigger a copy (in effect a silent memory leak).
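
To make the pattern concrete, here is a minimal sketch of the
module-variable approach, assuming a Unix system where the "fork" start
method is available. The names SHARED, worker and parallel_sum are
illustrative and not taken from this thread. Marking the array
read-only makes explicit in-place writes raise ValueError instead of
quietly copying pages; it does nothing about reference-count updates on
the array object itself.

```python
import multiprocessing as mp

import numpy as np

# Module-level array created before any worker is forked.
# Name and size are illustrative assumptions.
SHARED = np.arange(1_000_000, dtype=np.float64)
# Explicit in-place writes now raise ValueError rather than
# silently triggering a copy of the touched pages.
SHARED.flags.writeable = False


def worker(start, stop, out):
    # Read-only access to the array inherited from the parent.
    out.put(float(SHARED[start:stop].sum()))


def parallel_sum(n_workers=4):
    # Forked children inherit SHARED without pickling it.
    ctx = mp.get_context("fork")
    out = ctx.Queue()
    step = len(SHARED) // n_workers
    procs = []
    for i in range(n_workers):
        stop = len(SHARED) if i == n_workers - 1 else (i + 1) * step
        p = ctx.Process(target=worker, args=(i * step, stop, out))
        p.start()
        procs.append(p)
    # Collect one partial sum per worker, then wait for them to exit.
    total = sum(out.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(parallel_sum() == float(SHARED.sum()))
```

Only the small partial sums travel back through the queue; the array
itself is never sent to the workers.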

In summary, it seems to me that the memmap approach is fine for shared
data already available at fork, but one has to close the underlying
filehandle explicitly after all workers have quit.
Multicore numpy sounds interesting too. Does anyone know what the
state of that is?
Alternatively, if available, there is the Intel MKL (but that is
neither open nor free), and at that point it's hard to know which
parts will use parallel processing.
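
For the memmap approach summarized above, a rough sketch might look like
the following. It assumes a Unix "fork" start method and a temporary
backing file; the function names and the file handling are my own
illustration, not the exact code discussed in this thread. The mapping
is opened read-only before the fork and released only after every
worker has been joined.

```python
import multiprocessing as mp
import os
import tempfile

import numpy as np


def worker(data, start, stop, out):
    # The worker only reads from the inherited mapping.
    out.put(float(data[start:stop].sum()))


def memmap_sum(n_items=1_000_000, n_workers=4):
    ctx = mp.get_context("fork")
    fd, path = tempfile.mkstemp(suffix=".dat")
    os.close(fd)
    try:
        # Fill the backing file once, then drop the writable mapping.
        writer = np.memmap(path, dtype=np.float64, mode="w+",
                           shape=(n_items,))
        writer[:] = np.arange(n_items)
        writer.flush()
        del writer

        # Read-only mapping, opened before the workers are forked.
        data = np.memmap(path, dtype=np.float64, mode="r",
                         shape=(n_items,))
        out = ctx.Queue()
        step = n_items // n_workers
        procs = []
        for i in range(n_workers):
            stop = n_items if i == n_workers - 1 else (i + 1) * step
            p = ctx.Process(target=worker,
                            args=(data, i * step, stop, out))
            p.start()
            procs.append(p)
        total = sum(out.get() for _ in procs)
        for p in procs:
            p.join()
        # Release the parent's mapping only after all workers quit.
        del data
        return total
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(memmap_sum())
```

Opening the mapping with mode="r" also means an accidental write in a
worker fails loudly instead of producing a private copy.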

Felix

