[SciPy-user] normalizing data/distributions

Robert Kern rkern at ucsd.edu
Wed May 12 21:57:57 CDT 2004


Scott Bray wrote:

> Hey everyone,
> 
> i am working on a statistics type project for university study. i have a 
> set of data, have built a discrete probability distribution from this 
> data (using the cauchy distribution) and now want to normalize it.

I'm sorry, but this doesn't make sense to me. How does one build a 
discrete probability distribution using the Cauchy distribution and the 
data? The Cauchy is a continuous distribution. Do you mean that you 
fitted its parameters to the data? Or that you histogrammed the data?

> currently, the area of the distribution is not equal to one. i have been 
> trying to find literature about how to normalize, but have been 
> unsuccessful (and what i have found, i am unsure on the validity). some 
> say to normalize the data points by:
> 
> (data point - sample mean) / sample std

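The meaning of the word "normalization" varies with context. If you had 
reason to believe the data were Gaussian, this transformation would 
reduce the data to a standard form with zero mean and unit standard 
deviation. It rescales the data points themselves, not the area under a 
distribution, so it is not what you want, I think.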
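
For reference, here is a minimal sketch of that transformation, assuming 
NumPy is available. The function name `standardize` and the sample 
values are made up for illustration; they are not from your data.

```python
import numpy as np

def standardize(data):
    """Return (data - sample mean) / sample std for each data point."""
    data = np.asarray(data, dtype=float)
    # ddof=1 selects the sample (n - 1) standard deviation.
    return (data - data.mean()) / data.std(ddof=1)

# Hypothetical sample, only to show the effect of the transformation.
sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
z = standardize(sample)

# The result has zero mean and unit sample standard deviation.
print(z.mean(), z.std(ddof=1))
```

Note that nothing here touches a probability density; only the data 
points are shifted and rescaled.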

> others say to multiply by a normalising constant that is "chosen" to 
> make the area equal to one. i tried this by just scaling the area to 
> equal one.

That sounds about right, but I'm not sure exactly what you are 
describing. The area of what, exactly? And how are you calculating it?
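
If what you have is a curve sampled on a grid, one possible reading of 
"scaling the area to equal one" is sketched below. This is a guess at 
your setup, not a statement of what you did: the grid, the arbitrary 
factor 3.7, and the helper names `cauchy_pdf` and `normalize_area` are 
all assumptions, and the area is estimated with the trapezoid rule.

```python
import numpy as np

def cauchy_pdf(x, loc=0.0, scale=1.0):
    """Cauchy density with location `loc` and scale `scale`."""
    z = (x - loc) / scale
    return 1.0 / (np.pi * scale * (1.0 + z * z))

def trapezoid_area(x, y):
    """Area under the sampled curve (x, y) by the trapezoid rule."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def normalize_area(x, y):
    """Divide y by its area so the rescaled curve has area one."""
    return y / trapezoid_area(x, y)

# Hypothetical input: a Cauchy-shaped curve on [-10, 10] whose overall
# scale is wrong by an arbitrary factor.
x = np.linspace(-10.0, 10.0, 2001)
y = 3.7 * cauchy_pdf(x)

p = normalize_area(x, y)
print(trapezoid_area(x, p))
```

If instead you have discrete probability masses rather than a sampled 
density, the analogous step would be to divide the masses by their sum 
so that they add up to one.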

> i'm sorry if this is unclear, but if you have done this sort of thing 
> before, i would REALLY appreciate some help. just ask me some questions 
> if needs be.

Sure thing, but let's take the statistics issues off-list into private 
email until we get to SciPy issues.

> Thanks
> Scott

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter


