# [SciPy-user] CDF/PDF Stats with SciPy

Ivo Maljevic ivo.maljevic@gmail....
Mon Jul 20 18:12:01 CDT 2009

```
I meant complementary CDF (CCDF = 1 - CDF).

2009/7/20 Ivo Maljevic <ivo.maljevic@gmail.com>

> I am not sure I quite understand what you are doing (the first criterion is
> the success of an experiment, and the second criterion is based on
> statistics of the first test?), but regardless of what you are doing, you
> can apply the my_cdf() function I gave you to get the discrete CDF (or you
> can get the cumulative CDF as 1-CDF). To elaborate a little more on why I
> prefer this CDF approach: quite often, r.v.'s have long tails, and they tend
> to disappear when you do numerical integration (cumsum is the most basic
> approach) on the estimated pdf (which is the histogram). When you use all
> the available data points (instead of just 50 or so histogram bins), you get
> much better results.
>
> Once you find the CDF, you should be able to get your probabilities
> directly by reading off the plot values or by finding which Y-axis value
> (which is the probability) matches whatever bin you are interested in (on
> the X-axis).
>
> I don't know if this helps. If not, and if you have some real data, maybe I
> can write you some more code.
>
> Ivo
>
> 2009/7/20 Omer Khalid <Omer.Khalid@cern.ch>
>
>> Hi Ivo,
>>
>>
>>
>>> The bottom line is, are you interested in:
>>>
>>> a) determining the distribution from the actual data without bothering to
>>> know the exact formula, and drawing conclusions (that is, finding moments,
>>> probabilities, etc.) from it (that is what I normally do)
>>
>>
>> Yes, I am interested in this.
>>
>>
>>
>>> b) trying to determine which distribution fits your data best (e.g., is
>>> it normal, Rician, Rayleigh, Nakagami, etc.)
>>
>>
>> This is partially true.
>>
>> I think I should have explained more of my research question. My program
>> generates a real-valued variate X for every success. I keep storing X for
>> each successful cycle of my program, and once the sample list reaches size
>> 1000, I would like to use that sample space to determine the probability
>> for every subsequent X, storing each one again until the new sample space
>> reaches 1000.
>>
>> I am not really concerned with the distribution type of my sample space,
>> so I thought (maybe out of ignorance) that I must first determine the
>> distribution type using the fit function and then get the mean/std. Once I
>> have the mean/std, I get the CDF probability for every next X and store it
>> in my sample list, replacing the previous ones.
>>
>> Basically, I want to get a probability for every X in my program cycle
>> until the next sample space reaches 1000, and keep on doing it. This way I
>> am assuming my algorithm will learn to improve.
>>
>> But I could not figure out the proper Python code for this yet....
>>
>> Thanks,
>> Omer
>>
>
```
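
The approach discussed in the thread (estimate the CDF from all the data points rather than integrating a histogram, take CCDF = 1 - CDF, and score each new X against the last 1000 samples) might be sketched roughly as follows. This is only an illustration assuming NumPy: the helper names `empirical_cdf` and `cdf_at` are hypothetical and are not the `my_cdf()` function mentioned above, which does not appear in this message. The exponential test data and the value `new_x = 2.5` are made up for the example.

```python
from collections import deque

import numpy as np


def empirical_cdf(samples):
    """Return the sorted samples x and the estimate of P(X <= x) at each one."""
    x = np.sort(np.asarray(samples, dtype=float))
    # The k-th smallest sample (1-based) gets cumulative probability k / n.
    p = np.arange(1, x.size + 1) / x.size
    return x, p


def cdf_at(sorted_samples, value):
    """Empirical P(X <= value), given an already sorted sample array."""
    # side="right" counts how many samples are <= value.
    count = np.searchsorted(sorted_samples, value, side="right")
    return count / sorted_samples.size


# Synthetic long-tailed data, standing in for 1000 stored values of X.
rng = np.random.default_rng(0)
data = rng.exponential(scale=1.0, size=1000)

x, cdf = empirical_cdf(data)
ccdf = 1.0 - cdf  # complementary CDF, P(X > x)

# Rolling sample space: a deque with maxlen drops the oldest value
# automatically once it holds 1000 entries (an assumed reading of the
# "replace the previous ones" step described in the thread).
window = deque(data, maxlen=1000)
reference = np.sort(np.fromiter(window, dtype=float))

new_x = 2.5  # hypothetical next value produced by the program
p_le = cdf_at(reference, new_x)  # P(X <= new_x) under the current window
p_gt = 1.0 - p_le                # P(X > new_x), i.e. the CCDF at new_x
window.append(new_x)             # oldest sample falls out of the window

print(f"P(X <= {new_x}) = {p_le:.3f}, P(X > {new_x}) = {p_gt:.3f}")
```

No distribution fit or mean/std is needed here, since the probabilities are read directly from the sorted data. Re-sorting the window for every new X is simple but not the cheapest option; refreshing `reference` only once per 1000 new samples would be closer to the batch scheme described in the question.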