[Numpy-discussion] PRNGs and multi-threading

Sturla Molden sturla@molden...
Fri Aug 21 19:48:24 CDT 2009

I am not sure if this is the right place to discuss this issue. However, 
I problem I keep running into in Monte Carlo simulations is generating 
pseudorandom numbers with multiple threads. PRNGs such as the Mersenne 
Twister keep an internal state, which prevents the PRNG from being 
re-entrant and thread-safe.

Possible solutions:

1. Use multiple instances of PRNG states (numpy.random.RandomState), one 
for each thread. This should give no contention, but is this 
mathematically acceptable? I don't know. At least I have not seen any 
proof that it is.

2. Protect the PRNG internally with a spinlock. In Windows lingo, that is:

#include <windows.h>
static volatile long spinlock = 0;
#define ACQUIRE_SPINLOCK while(InterlockedExchangeAcquire(&spinlock, 1));
#define RELEASE_SPINLOCK InterlockedExchangeAcquire(&spinlock, 0);

Problem: possible contention between thread, idle work if threads spins 
a lot.

3. Use a conventional mutex object to protect the PRNG (e.g. 
threading.Lock in Python or CRITICAL_SECTION in Windows).  Problem: 
contention, context shifting, and mutexes tend to be slow. Possibly the 
worst solution.

4. Put the PRNG in a dedicated thread, fill up rather big arrays with 
pseudo-random numbers, and write them to a queue. Problem: Context 
shifting unless a CPU is dedicated to this task. Unless producing random 
numbers consitutes a major portion of the simulation, this should not 
lead to much contention.

import threading
import Queue
import numpy
from numpy.random import rand

class PRNG(threading.Thread):

    ''' a thread that generate arrays with random numbers
        and dumps them to a queue '''

    def __init__(self, nthreads, shape):
        self.shape = shape
        self.count = numpy.prod(shape)
        self.queue = Queue.Queue(4 * nthreads) # magic number
        self.stop_evt = threading.Event()
    def generate(self, block=True, timeout=None):
        return self.queue.get(block,timeout)
    def join(self, timeout=None):
    def run(self):
        tmp = rand(count).reshape(shape)
        while 1:
                self.queue.put(tmp, block=True, timeout=2.0)
            except Queue.Full:
                if self.stop_evt.isSet(): break
                tmp = rand(count).reshape(shape)

Do you have any view on this? Is there any way of creating multiple 
independent random states that will work correctly? I know of SPRNG 
(Scalable PRNG), but it is made to work with MPI (which I don't use).

Sturla Molden

More information about the NumPy-Discussion mailing list