[IPython-user] Multi-processor access to a large data set

Gael Varoquaux gael.varoquaux@normalesup....
Thu Dec 17 13:13:18 CST 2009


On Thu, Dec 17, 2009 at 11:19:00AM -0700, Robert Ferrell wrote:
> The tasks all need read-only access to a large-ish data set (~ 1GB).   
> I don't want to replicate this data set 8 times.  How do I give each  
> engine read-only access?

My approach is to use multiprocessing
(http://docs.python.org/library/multiprocessing.html, new in 2.6 but
available as a separate module for 2.5). If you are on a Unix box, the
processes are spawned using fork. The memory pages are 'copy on write'
after the fork, which means that as long as you don't write to the
arrays that were created before the fork, they won't be copied, and all
the processes share a single copy in memory.
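
Here is a minimal sketch of the idea, not a definitive recipe. It assumes
a Unix system and a recent Python 3 (get_context is not available in
2.6). The names DATA, worker and parallel_sum are made up for the
illustration, and a standard-library array stands in for the real data
set so that the example has no dependencies. The data is created before
the workers are forked, and the workers only read from it.

```python
import multiprocessing as mp
from array import array

# Hypothetical stand-in for the large read-only data set. It is built at
# module level, i.e. in the parent process, before any worker is forked.
DATA = array("d", range(1_000_000))


def worker(start, stop, out):
    # Read (never write) the inherited DATA. A memoryview slice does not
    # copy the underlying buffer, so the worker only touches shared pages.
    out.put(sum(memoryview(DATA)[start:stop]))


def parallel_sum(n_workers=4):
    # Ask explicitly for the fork start method (Unix only); the default
    # start method differs between platforms and Python versions.
    ctx = mp.get_context("fork")
    out = ctx.Queue()
    step = len(DATA) // n_workers
    procs = []
    for i in range(n_workers):
        # The last worker takes whatever is left over.
        stop = len(DATA) if i == n_workers - 1 else (i + 1) * step
        p = ctx.Process(target=worker, args=(i * step, stop, out))
        p.start()
        procs.append(p)
    # Collect the results before joining, so no worker blocks on the queue.
    total = sum(out.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(parallel_sum())
```

One caveat: in CPython, merely reading ordinary Python objects updates
their reference counts, which writes to the pages that hold them. The
sharing therefore works best for data kept in one large buffer (such as
the array above), and less well for, say, a big list of small objects.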

Gaël

