[Numpy-discussion] numpy.random and multiprocessing

David Cournapeau david@ar.media.kyoto-u.ac...
Thu Dec 11 11:29:55 CST 2008


Sturla Molden wrote:
> On 12/11/2008 6:10 PM, Michael Gilbert wrote:
>
>   
>> Shouldn't numpy (and/or multiprocessing) be smart enough to prevent
>> this kind of error?  A simple enough solution would be to also include
>> the process id as part of the seed 
>>     
>
> It would not help, as the seeding is done prior to forking.
>
> I am mostly familiar with Windows programming. But what is needed is a 
> fork handler (similar to a system hook in Windows jargon) that sets a 
> new seed in the child process.
>
> Could pthread_atfork be used?
>   

The seed could be explicitly set in each task, no?

import numpy as np

def task(x):
    # Reseed from fresh entropy inside the task, i.e. after the fork.
    np.random.seed()
    return np.random.random(x)

But does this really make sense?

Is the goal to parallelize a big sampler into N tasks of M trials, to
produce the same result as a sequential set of M*N trials? Then it does
not sound like a trivial task at all. I know there exist libraries
explicitly designed for parallel random number generation - maybe this
is where we should look, instead of using heuristics which are likely to
be bogus and to generate wrong results.
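
To make the "designed for parallel use" idea concrete, here is a minimal
sketch. It assumes a NumPy recent enough to provide
numpy.random.SeedSequence and numpy.random.default_rng (1.17 or later,
so newer than this thread). The helper names make_task_generators and
task, and the root seed 12345, are made up for illustration. Note that it
gives each task its own independent stream; it does not try to reproduce
the output of one sequential run of M*N trials.

```python
import numpy as np

def make_task_generators(root_seed, n_tasks):
    # One root seed for the whole job; spawn() derives one child seed
    # per task, intended to yield non-overlapping, independent streams.
    root = np.random.SeedSequence(root_seed)
    children = root.spawn(n_tasks)
    return [np.random.default_rng(child) for child in children]

def task(rng, m):
    # Each task draws only from the generator it was handed,
    # never from the global np.random state inherited at fork time.
    return rng.random(m)

# Hypothetical job: N = 4 tasks of M = 3 trials each.
generators = make_task_generators(12345, 4)
samples = [task(rng, 3) for rng in generators]
```

With multiprocessing, one would presumably pass the spawned child seeds
(or the generators built from them) to the workers as task arguments, so
the result no longer depends on when the fork happened.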

cheers,

David

