# [SciPy-user] normalizing data/distributions

Robert Kern rkern at ucsd.edu
Wed May 12 21:57:57 CDT 2004

```
Scott Bray wrote:

> Hey everyone,
>
> i am working on a statistics type project for university study. i have a
> set of data, have built a discrete probability distribution from this
> data (using the cauchy distribution) and now want to normalize it.

I'm sorry, but this doesn't make sense. How does one build a discrete
probability distribution using the Cauchy distribution and the data?
Cauchy is a continuous distribution. Do you mean that you fitted a
Cauchy distribution's parameters to the data? Or that you histogrammed
the data?

> currently, the area of the distribution is not equal to one. i have been
> trying to find literature about how to normalize, but have been
> unsuccessful (and what i have found, i am unsure on the validity). some
> say to normalize the data points by:
>
> (data point - sample mean) / sample std

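The meaning of the word "normalization" varies with context. If you had
reason to believe the data were Gaussian, this transformation would
reduce the data to a standard ("normal") form with zero mean and unit
variance. It rescales the data points, not the area under a
distribution, so it is not what you want, I think.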

> others say to multiply by a normalising constant that is "chosen" to
> make the area equal to one. i tried this by just scaling the area to
> equal one.

That sounds about right, though I'm not sure what you're describing
(the area of what, exactly? How are you calculating it?).

> i'm sorry if this is unclear, but if you have done this sort of thing
> before, i would REALLY appreciate some help. just ask me some questions
> if needs be.

Sure thing, but let's take the statistics issues off-list into private
email until we get to SciPy issues.

> Thanks
> Scott

--
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

```
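The two meanings of "normalization" discussed in the reply can be contrasted with a small sketch. The thread never says how the distribution was actually built, so everything here is an assumption for illustration: the Cauchy density is sampled on an evenly spaced grid over [-10, 10], the area is estimated with the trapezoid rule, and the helper names (`standardize`, `cauchy_pdf`, `trapezoid_area`, `normalize_to_unit_area`) are hypothetical, not SciPy functions.

```python
import math
import statistics


def standardize(data):
    """Z-score transform: (data point - sample mean) / sample std.

    This rescales the data points themselves; it does not change the
    area under any distribution.
    """
    mean = statistics.mean(data)
    std = statistics.stdev(data)  # sample standard deviation (n - 1)
    return [(x - mean) / std for x in data]


def cauchy_pdf(x, loc=0.0, scale=1.0):
    """Cauchy density with location `loc` and scale `scale`."""
    z = (x - loc) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


def trapezoid_area(xs, ys):
    """Trapezoid-rule estimate of the area under the points (xs, ys)."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def normalize_to_unit_area(xs, ys):
    """Divide by the estimated area so the rescaled curve has area one."""
    area = trapezoid_area(xs, ys)
    return [y / area for y in ys]


# Assumed grid: 201 evenly spaced points on [-10, 10].
xs = [-10.0 + 0.1 * i for i in range(201)]
ys = [cauchy_pdf(x) for x in xs]

# The Cauchy tails are heavy, so the area over a finite grid falls short of one.
raw_area = trapezoid_area(xs, ys)

# Multiplying by the constant 1 / raw_area makes the area equal to one.
scaled = normalize_to_unit_area(xs, ys)
scaled_area = trapezoid_area(xs, scaled)

# The other sense of "normalization": standardizing the data points.
z_scores = standardize([2.0, 4.0, 4.0, 5.0, 7.0])
```

Note that rescaling a truncated grid yields a density for the truncated range only; whether that is appropriate depends on what the "area" in the original question refers to, which the thread leaves open.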