[Numpy-discussion] numpy.random and multiprocessing

Gael Varoquaux gael.varoquaux@normalesup....
Thu Dec 11 10:59:14 CST 2008


On Thu, Dec 11, 2008 at 05:55:58PM +0100, Sturla Molden wrote:
> > No, Pool is what I want, because in my production code I am submitting
> > jobs to that pool.

> Sure, a pool is fine. I was just speculating that one of the four 
> processes in your pool was idle all the time; i.e. that one of the other 
> three got to do the task twice. Therefore you only got three identical 
> results and not four. It depends on how the OS schedules the processes, 
> the number of logical CPUs, etc. You have no control over that. But if 
> you had used N instances of multiprocessing.Pool instead, all N results 
> should have been identical (if the 'random' generator is completely 
> deterministic) - because each process would do the task once.

> I.e. you only got three identical results due to a race condition in 
> the task queue.

Gotcha! Good explanation. Now I understand my earlier investigation
better. I think you are completely right.

So indeed, as I initially thought, using multiprocessing without reseeding
is going to get you into big trouble (and this is what I experienced in my
code).
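
For illustration, here is a minimal sketch of one way to reseed inside a
pool. It is not the code from my production run: the names reseed, draw
and run are made up for this example, and it assumes that workers are
started by fork, so that each one inherits a copy of the parent's global
numpy.random state.

```python
import multiprocessing as mp

import numpy as np


def reseed():
    """Pool initializer (hypothetical name): runs once in each worker."""
    # Called without an argument, np.random.seed() reseeds the global
    # generator from fresh OS entropy, so the worker no longer shares
    # the state it inherited from the parent process.
    np.random.seed()


def draw(_):
    """Task (hypothetical name): one number from the global generator."""
    return np.random.random()


def run(n_tasks=4, n_workers=4, reseed_workers=True):
    """Map draw over a pool, optionally reseeding every worker first."""
    init = reseed if reseed_workers else None
    with mp.Pool(n_workers, initializer=init) as pool:
        return pool.map(draw, range(n_tasks))
```

Calling run(reseed_workers=False) under an if __name__ == "__main__":
guard on a fork-based platform would be expected to return repeated
values, and which values repeat depends on how the tasks happen to be
spread over the workers. With reseed_workers=True each worker draws
from its own stream. The initializer is used here because it runs in
every worker exactly once, however the task queue is scheduled.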

Thanks for the explanation,

Gaël

