[Numpy-discussion] Statistical distributions on samples

Andrea Gavana andrea.gavana@gmail....
Fri Aug 12 08:32:05 CDT 2011

Hi All,

    I am working on something that appeared to be a no-brainer issue (at the
beginning), but my complete ignorance of statistics is overwhelming and I got

What I am trying to do can be summarized as follows:

Let's assume that I have to generate a sample of 1,000 values for a
variable (let's say, "velocity") using a normal distribution (but later I
will have to do it with log-normal, triangular and a couple of others). The
only thing I know about this velocity sample is the minimum and maximum
values (let's say 50 and 200 respectively) and, obviously for the normal
distribution (but not so for the other distributions), the mean value (125
in this case).

Now, I would like to generate this sample of 1,000 points, in which none of
the points has a velocity smaller than 50 or bigger than 200, and the number of
samples close to the mean (125) should be higher than the number of samples
close to the minimum and the maximum, following some kind of normal

What I have tried up to now is summarized in the code below, but as you can
easily see, I don't really know what I am doing. I am open to every
suggestion, and I apologize for the dumbness of my question.

import numpy

from scipy import stats
import matplotlib.pyplot as plt

minval, maxval = 50.0, 200.0
x = numpy.linspace(minval, maxval, 500)

samp = stats.norm.rvs(size=len(x))
pdf = stats.norm.pdf(x)
cdf = stats.norm.cdf(x)
ppf = stats.norm.ppf(x)

ax1 = plt.subplot(2, 2, 1)
ax1.plot(range(len(x)), samp)

ax2 = plt.subplot(2, 2, 2)
ax2.plot(x, pdf)

ax3 = plt.subplot(2, 2, 3)
ax3.plot(x, cdf)

ax4 = plt.subplot(2, 2, 4)
ax4.plot(x, ppf)

plt.show()
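
The requirement described above (1,000 values, none below 50 or above 200, clustered around 125) is close to what a truncated normal distribution gives. The following is only a minimal sketch of that idea using scipy.stats.truncnorm, not a definitive answer. The standard deviation of 25 is an assumption, since the post gives only the minimum, maximum and mean; the names std, dist and samples are illustrative.

```python
import numpy as np
from scipy import stats

minval, maxval, mean = 50.0, 200.0, 125.0
std = 25.0  # assumption: the spread is not given in the post

# truncnorm expects its bounds expressed in standard deviations from loc
a = (minval - mean) / std
b = (maxval - mean) / std
dist = stats.truncnorm(a, b, loc=mean, scale=std)

# Draw 1,000 values; the seed is fixed only to make the sketch repeatable
samples = dist.rvs(size=1000, random_state=0)

# Every value lies inside [minval, maxval]
print(samples.min(), samples.max(), samples.mean())

# The same frozen distribution gives a pdf on the velocity axis for plotting
x = np.linspace(minval, maxval, 500)
pdf = dist.pdf(x)
```

A smaller std concentrates the values more tightly around 125, while a larger one flattens the sample towards a uniform spread between the two bounds. Because the bounds here are symmetric about 125, the mean of the truncated distribution stays at 125; with asymmetric bounds it would shift away from loc.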



"Imagination Is The Only Weapon In The War Against Reality."