[SciPy-user] random number sampling performance in newcore

Robert Kern rkern at ucsd.edu
Tue Nov 8 18:11:01 CST 2005

Chris Fonnesbeck wrote:
> I was surprised to find that the random number sampler in newcore is
> significantly slower than RandomArray in Numeric (at least for
> binomial sampling). A quick comparison showed the
> scipy.basic.random.binomial sampler to be over 8 times slower than
> RandomArray.binomial. I was surprised, since the newcore stuff is
> Pyrex-based (isnt it?).

It's a Pyrex wrapper to pure C code. See distributions.c for the actual
implementations of the distribution algorithms.

> Am I the only one observing such differences,
> or am I using the wrong method? If not, is the performance expected to
> improve significantly.

Can you show us the code you're using? Particularly, the parameters p
and n. I use two different algorithms for the binomial distribution, a
waiting time algorithm when p*n <= 30 (or (1-p)*n <= 30 if p > 0.5) and
the BTPE algorithm (the same algorithm that's in RANLIB, but
reimplemented) when p*n > 30 (or (1-p)*n ...).

In [10]: tra = Timer('x=RA.binomial(n,p,size)', 'import RandomArray as
RA; n=200; p=0.25; size=10000')

In [11]: tra.repeat(3,100)
Out[11]: [1.5208730697631836, 1.5047860145568848, 1.5198800563812256]

In [12]: tra = Timer('x=RA.binomial(n,p,size)', 'import RandomArray as
RA; n=20; p=0.25; size=10000')

In [13]: tra.repeat(3,100)
Out[13]: [0.89270305633544922, 0.90227413177490234, 1.183967113494873]

In [14]: tmt = Timer('x=random.binomial(n,p,size)', 'from scipy import
random; n=200; p=0.25; size=10000')

In [15]: tmt.repeat(3,100)
Out[15]: [1.922713041305542, 2.0582780838012695, 1.906635046005249]

In [16]: tmt = Timer('x=random.binomial(n,p,size)', 'from scipy import
random; n=20; p=0.25; size=10000')

In [17]: tmt.repeat(3,100)
Out[17]: [3.1456100940704346, 2.9999620914459229, 2.9954609870910645]

It looks like my implementation of BTPE is 30% slower than RANLIB and
the waiting time algorithm is 3 times slower than RANLIB's BTPE on my

Compare to normal():

In [21]: tra = Timer('x=RA.normal(0.0, 1.0, size)', 'import RandomArray
as RA; size=10000')
In [22]: tra.repeat(3,100)
Out[22]: [0.70471906661987305, 0.76926708221435547, 0.7034919261932373]

In [23]: tmt = Timer('x=random.normal(size=size)', 'from scipy import
random; size=10000')
In [24]: tmt.repeat(3,100)
Out[24]: [0.56431293487548828, 0.52272319793701172, 0.79108381271362305]

So the evidence suggests that the slowness is constrained to individual
distributions, one of them being binomial. I'll futz with it.

Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter

More information about the SciPy-user mailing list