[NumPy-Tickets] [NumPy] #1450: Patch with Ziggurat method for Normal distribution

NumPy Trac numpy-tickets@scipy....
Fri Apr 9 21:30:38 CDT 2010


#1450: Patch with Ziggurat method for Normal distribution
---------------------------------+------------------------------------------
 Reporter:  ilan                 |       Owner:  somebody
     Type:  enhancement          |      Status:  new     
 Priority:  normal               |   Milestone:          
Component:  numpy.random         |     Version:  1.4.0   
 Keywords:  normal distribution  |  
---------------------------------+------------------------------------------

Comment(by charris):

 We've had better ziggurat methods than those in Doornik for some years
 now, but there has been little push to get them into numpy.


 {{{
 Hi there.

 I've completed timing and robustness tests that should
 allow apples-to-apples comparisons of the mtrand normal distribution
 possibilities.  The short story goes like this:

  0) One of the 'Harris' ziggurats, zig7, passes all TestU01 crush tests
    when running atop each of five PRNGs.

  1) Depending on how fast the underlying PRNG is, zig7
    outperforms Box_Muller by ~1.3X to ~8.5X.

  2) Using gcc, mwc8222+zig7 outperforms mt19937+Box_Muller by 7X.

 See you both at SciPy.

 Bruce


 And here are the details:

 In what follows the uniform variate generators are:
  lcg64
  mwc8222
  mt19937
  mt19937_64
  yarn5

 And the normal distribution codes are:
  trng  - default normal distribution code in TRNG
  boxm  - Box-Muller, mtrand lookalike, remembers/uses 2nd value
  zig7  - a 'Harris' ziggurat indexed by 7 bits
  zig8  - a 'Harris' ziggurat indexed by 8 bits
  zig9  - a 'Harris' ziggurat indexed by 9 bits

 Here are the numbers in more detail:

 # Timings from icc -O2 running on a 2.4 GHz Core 2

 lcg64 trng: 6.52459e+06 ops per second
 lcg64 boxm: 2.18453e+07 ops per second
 lcg64 zig7: 1.80616e+08 ops per second
 lcg64 zig8: 2.01865e+08 ops per second
 lcg64 zig9: 2.05156e+08 ops per second

 mwc8222 trng: 6.52459e+06 ops per second
 mwc8222 boxm: 2.08787e+07 ops per second
 mwc8222 zig7: 9.44663e+07 ops per second
 mwc8222 zig8: 1.05326e+08 ops per second
 mwc8222 zig9: 1.03478e+08 ops per second

 mt19937 trng: 6.41112e+06 ops per second
 mt19937 boxm: 1.64986e+07 ops per second
 mt19937 zig7: 4.23762e+07 ops per second
 mt19937 zig8: 4.52623e+07 ops per second
 mt19937 zig9: 4.52623e+07 ops per second

 mt19937_64 trng: 6.42509e+06 ops per second
 mt19937_64 boxm: 1.93226e+07 ops per second
 mt19937_64 zig7: 5.8762e+07 ops per second
 mt19937_64 zig8: 6.17213e+07 ops per second
 mt19937_64 zig9: 6.29146e+07 ops per second

 yarn5 trng: 5.95781e+06 ops per second
 yarn5 boxm: 1.19156e+07 ops per second
 yarn5 zig7: 1.48945e+07 ops per second
 yarn5 zig8: 1.54809e+07 ops per second
 yarn5 zig9: 1.53201e+07 ops per second


 # Timings from g++ -O2 running on a 2.4 GHz Core 2

 lcg64 trng: 6.72163e+06 ops per second
 lcg64 boxm: 1.50465e+07 ops per second
 lcg64 zig7: 1.31072e+08 ops per second
 lcg64 zig8: 1.48383e+08 ops per second
 lcg64 zig9: 1.6036e+08 ops per second

 mwc8222 trng: 6.64215e+06 ops per second
 mwc8222 boxm: 1.44299e+07 ops per second
 mwc8222 zig7: 8.903e+07 ops per second
 mwc8222 zig8: 1.00825e+08 ops per second
 mwc8222 zig9: 1.03478e+08 ops per second

 mt19937 trng: 6.52459e+06 ops per second
 mt19937 boxm: 1.28223e+07 ops per second
 mt19937 zig7: 5.00116e+07 ops per second
 mt19937 zig8: 5.41123e+07 ops per second
 mt19937 zig9: 5.47083e+07 ops per second

 mt19937_64 trng: 6.58285e+06 ops per second
 mt19937_64 boxm: 1.42988e+07 ops per second
 mt19937_64 zig7: 6.72164e+07 ops per second
 mt19937_64 zig8: 7.39591e+07 ops per second
 mt19937_64 zig9: 7.46022e+07 ops per second

 yarn5 trng: 6.25144e+06 ops per second
 yarn5 boxm: 8.93672e+06 ops per second
 yarn5 zig7: 1.50465e+07 ops per second
 yarn5 zig8: 1.57496e+07 ops per second
 yarn5 zig9: 1.56038e+07 ops per second
 }}}
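 The headline 7X figure can be checked directly against the quoted g++
 table (mwc8222 zig7 versus mt19937 boxm):

```python
# Speedup implied by the g++ -O2 timings quoted above.
mwc8222_zig7 = 8.903e7    # ops per second, mwc8222 + zig7
mt19937_boxm = 1.28223e7  # ops per second, mt19937 + Box-Muller
speedup = mwc8222_zig7 / mt19937_boxm
print(f"speedup: {speedup:.2f}X")  # about 6.94X, i.e. the ~7X quoted
```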

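 For reference, 'boxm' above denotes the polar Box-Muller method, which
 produces normal deviates in pairs and caches the second one (the
 "remembers/uses 2nd value" in the legend). A minimal Python sketch of
 that scheme, as an illustration only, not numpy's actual mtrand code:

```python
import math
import random

class PolarBoxMuller:
    """Polar (Marsaglia) Box-Muller: each round of the rejection loop
    yields two deviates; the second is cached for the next call."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._spare = None

    def normal(self):
        # Use the cached second deviate if one is available.
        if self._spare is not None:
            z, self._spare = self._spare, None
            return z
        # Rejection-sample a point uniformly inside the unit disc.
        while True:
            u = 2.0 * self._rng.random() - 1.0
            v = 2.0 * self._rng.random() - 1.0
            s = u * u + v * v
            if 0.0 < s < 1.0:
                break
        m = math.sqrt(-2.0 * math.log(s) / s)
        self._spare = v * m   # cache the second deviate
        return u * m
```

 The rejection loop and the log/sqrt per pair are what the ziggurat
 avoids on its fast path, hence the gap in the tables above.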
 You should contact Bruce Carneal <bcarneal@gmail.com> and see if you can
 get his code. It is in C++, but it can probably be translated to C without
 too much trouble. If we are going in the ziggurat direction, it would be
 nice to implement the exponential distribution as well. As you can see,
 the overall speedup also depends on the underlying RNG, so if you can
 figure out a way of making choices available there, that will help too. Of
 course, for small sets of values, call overhead will dominate everything.

 Thanks for looking into this problem; all it needs is someone to take it
 in hand and push it through.

-- 
Ticket URL: <http://projects.scipy.org/numpy/ticket/1450#comment:1>
NumPy <http://projects.scipy.org/numpy>


More information about the NumPy-Tickets mailing list