[Numpy-discussion] Numpy-discussion Digest, Vol 6, Issue 20
James A. Bednar
jbednar@inf.ed.ac...
Fri Mar 9 14:04:56 CST 2007
 Date: Fri, 9 Mar 2007 06:58:32 -0800
 From: "Sebastian Haase" <haase@msg.ucsf.edu>
 Subject: Re: [Numpy-discussion] Numpy-discussion Digest, Vol 6, Issue 18
 To: "Discussion of Numerical Python" <numpydiscussion@scipy.org>

 On 3/9/07, James A. Bednar <jbednar@inf.ed.ac.uk> wrote:
 >  From: Robert Kern <robert.kern@gmail.com>
 >  Subject: Re: [Numpy-discussion] in place random generation
 > 
 >  Daniel Mahler wrote:
 >  > On 3/8/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
 > 
 >  >> Robert thought this might relate to Travis' changes adding
 >  >> broadcasting to the random number generator. It does seem
 >  >> certain that generating small arrays of random numbers has a
 >  >> very high overhead.
 >  >
 >  > Does that mean someone is working on fixing this?
 > 
 >  It's not on the top of my list, no.
 >
 > I just wanted to put in a vote saying that generating a large quantity
 > of small arrays of random numbers is quite important in my field, and
 > is something that is definitely slowing us down right now.
 >
 > We often simulate neural networks whose many, many small weight
 > matrices need to be initialized with random numbers, and we are seeing
 > quite slow startup times (on the order of minutes, even though
 > reloading a pickled snapshot of the same simulation once it has been
 > initialized takes only a few seconds).
 >
 > The quality of these particular random numbers doesn't matter very
 > much for us, so we are looking for some cheaper way to fill a bunch of
 > small matrices with at least passably random values. But it would of
 > course be better if the regular high-quality random number support in
 > Numpy were speedy under these conditions...

 Hey Jim,

 Could you not create all the many arrays so that they share one large
 chunk of contiguous memory?
 Like: 1) create a large 1-D array
 2) create all the small arrays in a for loop using
 numpy.ndarray(buffer=largeArray[offset:], shape=..., dtype=...),
 incrementing offset appropriately during the loop
 3) then you can reset all the small arrays to new random numbers with
 one call that refills the large array (they all have the same
 statistics (mean, stddev, type), right?)
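 The steps above could be sketched roughly like this (shapes and names
 are illustrative, not from the original post; this uses slicing and
 reshape to get views into the big buffer, which is equivalent to the
 numpy.ndarray(buffer=...) construction):

 ```python
 import numpy as np

 # Hypothetical small weight-matrix shapes.
 shapes = [(3, 4), (5, 5), (2, 7)]
 sizes = [int(np.prod(s)) for s in shapes]

 # 1) One large contiguous 1-D chunk backing everything.
 big = np.empty(sum(sizes), dtype=np.float64)

 # 2) Carve each small array out of the big buffer as a view.
 small = []
 offset = 0
 for shape, size in zip(shapes, sizes):
     small.append(big[offset:offset + size].reshape(shape))
     offset += size

 # 3) One call refills every small array, since they all share
 # big's memory (assuming they want the same distribution).
 big[:] = np.random.standard_normal(big.size)
 ```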
In principle, I *think* we could make that work. But we maintain a
large object-oriented toolkit for computational neuroscience (see
topographica.org), and we try to let each object take care of its own
business as much as possible, so that people can later swap things out
with their own customized versions. That's hard to do if we set up
global dependencies like this, and the results are quite difficult to
maintain.
Of course, we can and do put in optimizations for certain special
cases, but I suspect that it will be simpler in this case just to add
some fast-and-dirty but general way to fill small arrays with random
values. Still, it would be much simpler for us if the basic numpy
small random array support had less overhead...
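One way such a fast-but-general fill might look, as a hedged sketch (not
Jim's actual code): amortize the per-call overhead by drawing one big
block of random numbers and copying slices into independently owned
small arrays, so each object still owns its own matrix and there is no
shared global buffer. The function name and shapes are illustrative.

```python
import numpy as np

def fill_random(arrays, rng=np.random):
    """Refill every array in `arrays` using a single RNG call."""
    total = sum(a.size for a in arrays)
    pool = rng.standard_normal(total)  # one call instead of len(arrays) calls
    offset = 0
    for a in arrays:
        # Copy a slice of the pool into the array, whatever its shape.
        a.flat[:] = pool[offset:offset + a.size]
        offset += a.size

# Hypothetical: many small weight matrices, each independently owned.
weights = [np.empty((4, 4)) for _ in range(1000)]
fill_random(weights)
```

The trade-off versus the shared-buffer approach is an extra copy per
array, in exchange for keeping each matrix a self-contained object.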
Jim