[Numpy-discussion] Pull Request Review: R-like sample function
Thu Sep 1 21:01:36 CDT 2011
On Thu, Sep 1, 2011 at 6:02 PM, Christopher Jordan-Squire wrote:
> Hi--I've just submitted a numpy 2.0 pull request for a function sample
> in np.random. It's essentially an implementation of R's sample
> function. It allows possibly non-uniform, possibly without-replacement
> sampling from a given 1-D array-like. This is very useful for quickly
> and cleanly creating samples from, for example, a list of strings or a
> list of non-contiguous, non-evenly spaced integers. Both occur in data
> analysis with categorical data.
> It is, essentially, a convenience function that wraps a number of
> existing ways to take a random sample. I think it belongs in
> numpy.random rather than scipy.stats because it's just a random
> sampler, rather than a probability distribution. It isn't possible to
> define a scipy.stats discrete random variable on strings--it would
> have to instead be done on the indices of the list containing the
> possible samples. And (as far as I can tell) the scipy.stats
> distributions can't be used for sampling without replacement.
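To make the behaviour described above concrete, here is a rough pure-Python sketch of such a sampler. It is not the code in the pull request: the name sample_sketch, its parameters (size, replace, p, rng) and the draw-one-at-a-time approach to weighted sampling without replacement are all assumptions made for illustration, and it uses only the standard library's random module rather than NumPy.

```python
import random


def sample_sketch(population, size, replace=True, p=None, rng=None):
    """Hypothetical stand-in for an R-like sampler (not the PR's code).

    population: any 1-D sequence (strings, arbitrary integers, ...)
    p: optional per-item weights; uniform when omitted
    """
    if rng is None:
        rng = random.Random()
    items = list(population)
    weights = [1.0] * len(items) if p is None else list(p)
    if len(weights) != len(items):
        raise ValueError("p must have one weight per item")

    if replace:
        # Weighted draws with replacement are available directly.
        return rng.choices(items, weights=weights, k=size)

    if size > len(items):
        raise ValueError("cannot draw more items than exist without replacement")

    # Without replacement: draw one position at a time, then remove the
    # chosen item and its weight so it cannot be picked again.
    picked = []
    for _ in range(size):
        (pos,) = rng.choices(range(len(items)), weights=weights, k=1)
        picked.append(items.pop(pos))
        weights.pop(pos)
    return picked


rng = random.Random(0)  # seeded only to make the example repeatable
labels = ["red", "green", "blue"]
with_repl = sample_sketch(labels, 5, replace=True, p=[0.5, 0.3, 0.2], rng=rng)
without_repl = sample_sketch([3, 17, 42, 1000], 3, replace=False, rng=rng)
```

Note that the sampling works on positions in the list, so the items themselves can be strings or unevenly spaced integers.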
I don't think you can remove numpy.random.random and similar functions
in the same commit that adds a new function.
First these functions would need to be deprecated.
"it does not break the API as the previous function was not in the docs"
This is a doc bug, I assume. I don't think it means users/developers
don't rely on it.
Searching for np.random.random shows 120 threads in my Gmail reader,
Python's standard library uses random.random(), and
dir(np.random) lists it.
I copied it from mailing list examples. It's used quite a bit in
scipy, as I saw because of your work.
I also find the historical multiplicity of aliases confusing, but
which names should be deprecated would at least require a discussion
and a separate commit.
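For reference, a deprecation step usually looks something like the sketch below. The names kept_function and old_alias are hypothetical placeholders, since which names would be deprecated is exactly what still needs discussion; the only real API used is the standard library's warnings module.

```python
import warnings


def kept_function():
    """Placeholder for whichever name is kept (hypothetical)."""
    return 0.5  # fixed dummy value, not a real random draw


def old_alias():
    """Hypothetical deprecated alias that still works but warns."""
    warnings.warn(
        "old_alias() is deprecated; use kept_function() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    return kept_function()


# Existing callers keep working; they just see a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_alias()
```

The alias would then be removed only in a later release, after users have had time to see the warning.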
> -Chris JS