[Numpy-discussion] sample without replacement
John Salvatier
jsalvati@u.washington....
Mon Dec 20 11:13:21 CST 2010
I don't think this can be done efficiently with numpy alone. If you want
to do it efficiently, I wrote a no-replacement sampler in Cython some time
ago (below). I hereby release it to the public domain.
'''
Created on Oct 24, 2009
http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement
@author: johnsalvatier
'''
from __future__ import division
import numpy

def random_no_replace(sampleSize, populationSize, numSamples):
    # Each row holds one sample of sampleSize distinct indices
    # drawn from range(populationSize).
    samples = numpy.zeros((numSamples, sampleSize), dtype=int)

    # Use Knuth's variable names
    cdef int n = sampleSize
    cdef int N = populationSize
    cdef int i = 0  # index of the sample being filled (typed as a C int)
    cdef int t = 0  # total input records dealt with
    cdef int m = 0  # number of items selected so far
    cdef double u

    while i < numSamples:
        t = 0
        m = 0
        while m < n:
            # call a uniform(0,1) random number generator
            u = numpy.random.uniform()
            if (N - t)*u >= n - m:
                # skip record t
                t += 1
            else:
                # select record t
                samples[i, m] = t
                t += 1
                m += 1
        i += 1
    return samples
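
The selection-sampling loop above can also be sketched in plain Python for
anyone without a Cython toolchain. This is only an illustration under
assumptions: the function name sample_indices_no_replace and the uniform
parameter are hypothetical and not part of the Cython code. The uniform
argument is any callable returning a float in [0, 1), so
numpy.random.uniform could be passed in to stay with a single PRNG.

```python
import random


def sample_indices_no_replace(sample_size, population_size, uniform=random.random):
    """Return sample_size distinct indices from range(population_size).

    Hypothetical helper for illustration: one pass of selection sampling,
    using the same acceptance test as the Cython loop.
    """
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 0 and population_size")

    n, N = sample_size, population_size  # Knuth's variable names
    selected = []
    t = 0  # total input records dealt with
    while len(selected) < n:
        u = uniform()  # assumed to return a float in [0, 1)
        # Select record t with probability (n - m) / (N - t),
        # where m is the number of items selected so far.
        if (N - t) * u < n - len(selected):
            selected.append(t)
        t += 1
    return selected


# Usage: turn the sampled indices into elements of a vector.
myvector = ["a", "b", "c", "d", "e"]
idx = sample_indices_no_replace(3, len(myvector))
picked = [myvector[i] for i in idx]
```

Note that the indices come out in increasing order, so the result is a
random subset rather than a randomly ordered one; shuffle it afterwards if
the order matters.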
On Mon, Dec 20, 2010 at 8:28 AM, Alan G Isaac <alan.isaac@gmail.com> wrote:
> I want to sample *without* replacement from a vector
> (as with Python's random.sample). I don't see a direct
> replacement for this, and I don't want to carry two
> PRNG's around. Is the best way something like this?
>
> permutation(myvector)[:samplesize]
>
> Thanks,
> Alan Isaac