[Numpy-discussion] Adopt Mersenne Twister 64bit?
Mon Mar 11 04:46:54 CDT 2013
On Sun, Mar 10, 2013 at 6:12 PM, Siu Kwan Lam <email@example.com> wrote:
> Hi all,
> I am redirecting a discussion on github issue tracker here. My original
> post (https://github.com/numpy/numpy/issues/3137):
> "The current implementation of the RNG seems to be MT19937-32. Since 64-bit
> machines are common nowadays, I am suggesting adding or upgrading to
> MT19937-64. Thoughts?"
> Let me start by responding to njsmith's comments on the issue tracker:
> Would it be faster?
> Although I have not benchmarked the 64-bit implementation, it is likely that
> it will be faster on a 64-bit machine since the number of iterations
> (controlled by NN and MM in the reference implementation)
> is reduced by half. In addition, each generation in the 64-bit
> implementation produces a 64-bit random int which can be used to generate
> a double precision random number, unlike the 32-bit implementation, which
> requires generating a pair of 32-bit random ints.
From the last time this was brought up, it looks like getting a single
64-bit integer out from MT19937-64 takes about the same amount of time
as getting a single 32-bit integer from MT19937-32, perhaps a little
slower, even on a 64-bit machine.
So getting a single double would be not quite twice as fast.
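
To make the "one draw versus two draws" point concrete, here is a minimal sketch of the usual 53-bit construction of a double in [0, 1). The function names are made up for illustration, and the bit source is Python's stdlib `random.Random`, used only as a stand-in for the core generator. The shifts and constants follow the common Mersenne Twister reference recipe; treat the claim that any particular library does exactly this as an assumption.

```python
import random

TWO_53 = 9007199254740992.0  # 2**53, the number of distinct doubles produced


def double_from_two_uint32(first, second):
    """Build a double in [0, 1) from two 32-bit draws (hypothetical helper)."""
    a = first >> 5    # keep the top 27 bits
    b = second >> 6   # keep the top 26 bits
    # 67108864 == 2**26, so a * 2**26 + b is a 53-bit integer
    return (a * 67108864.0 + b) / TWO_53


def double_from_uint64(x):
    """Build a double in [0, 1) from one 64-bit draw (hypothetical helper)."""
    return (x >> 11) / TWO_53  # keep the top 53 bits


bits = random.Random(0)  # stand-in bit source, not part of the proposal
u = double_from_two_uint32(bits.getrandbits(32), bits.getrandbits(32))
v = double_from_uint64(bits.getrandbits(64))
print(u, v)
```

Both paths end up with 53 random bits; the 64-bit path just gets them from one call to the core generator instead of two.
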
> But, on a 32-bit machine, a 64-bit instruction is translated into 4 32-bit
> instructions; thus, it is likely to be slower. (1)
> Use less memory?
> The amount of memory use will remain the same. The size of the RNG state is
> the same.
> Provide higher quality randomness?
> My naive answer is that the 32-bit and 64-bit implementations have the same
> 2^19937-1 period. I need to do some research and experiments.
> Would it change the output of this program:
>     import numpy
>     numpy.random.seed(0)
>     print numpy.random.random()
> ?
> Unfortunately, yes. The 64-bit implementation generates a different random
> number sequence with the same seed. (2)
> My suggestion to overcome (1) and (2) is to allow the user to select between
> the two implementations (and possibly different algorithms in the future).
> If the user does not provide a choice, we use MT19937-32 by default.
> numpy.random.set_state("MT19937_64", …) # choose the 64-bit
Most likely, the different PRNGs should be different subclasses of
RandomState. The module-level convenience API should probably be left
alone. If you need to control the PRNG that you are using, you really
need to be passing around a RandomState instance and not relying on
reseeding the shared global instance. Aside: I really wish we hadn't
exposed `set_state()` in the module API. It's an attractive nuisance.
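
A rough sketch of what the subclass idea could look like, in pure Python so it runs without NumPy. Everything here is hypothetical: `BaseGenerator`, `Toy32`, `Toy64` and `simulate_waits` are names invented for the example, and the two cores are small stand-ins (a 32-bit LCG and splitmix64), not MT19937-32 or MT19937-64. The point is only the shape: the distribution code lives once in the base class, each subclass supplies the uniform core, and the caller passes an instance around instead of reseeding a shared global.

```python
import math

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
TWO_53 = 9007199254740992.0  # 2**53


class BaseGenerator:
    """Hypothetical base class: distributions here, core bits in subclasses."""

    def __init__(self, seed):
        self.state = seed

    def random_sample(self):
        """Return a double in [0, 1); each subclass provides its own core."""
        raise NotImplementedError

    def exponential(self, scale=1.0):
        # Inverse-transform sampling, written once for every core generator.
        return -scale * math.log(1.0 - self.random_sample())


class Toy32(BaseGenerator):
    """Stand-in 32-bit core (a simple LCG, NOT MT19937-32)."""

    def _next_uint32(self):
        self.state = (1664525 * self.state + 1013904223) & MASK32
        return self.state

    def random_sample(self):
        # Two 32-bit draws are needed for one 53-bit double.
        a = self._next_uint32() >> 5
        b = self._next_uint32() >> 6
        return (a * 67108864.0 + b) / TWO_53


class Toy64(BaseGenerator):
    """Stand-in 64-bit core (splitmix64, NOT MT19937-64)."""

    def _next_uint64(self):
        self.state = (self.state + 0x9E3779B97F4A7C15) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)

    def random_sample(self):
        # One 64-bit draw is enough for one 53-bit double.
        return (self._next_uint64() >> 11) / TWO_53


def simulate_waits(rng, n):
    """Take the generator as an argument rather than touching global state."""
    return [rng.exponential(2.0) for _ in range(n)]


waits_32 = simulate_waits(Toy32(0), 3)
waits_64 = simulate_waits(Toy64(0), 3)
print(waits_32, waits_64)
```

Code written like `simulate_waits` does not care which core it was handed, so picking a different generator is a decision made once, where the instance is created.
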
There is some low-level C work that needs to be done to allow the
non-uniform distributions to be shared between implementations of the
core uniform PRNG, but that's the same no matter how you organize the