[SciPy-User] multivariate empirical distribution function, avoid double loop ?
Wed Aug 24 13:59:09 CDT 2011
On Wed, Aug 24, 2011 at 2:27 PM, Alan G Isaac <firstname.lastname@example.org> wrote:
> On 8/24/2011 10:23 AM, email@example.com wrote:
>> Does anyone know whether there is an algorithm that avoids the double
>> loop to get a multivariate empirical distribution function?
> I think that is pretty standard.
> I'll attach something posted a while ago.
> It seemed right at the time, but I did
> not test it. Once upon a time it was at
> import numpy as np
>
> def empiricalcdf(data, method='Hazen'):
>     """Return the empirical cdf.
>
>     Methods available (here i goes from 1 to N)
>         Hazen:      (i-0.5)/N
>         Weibull:    i/(N+1)
>         Chegodayev: (i-.3)/(N+.4)
>         Cunnane:    (i-.4)/(N+.2)
>         Gringorten: (i-.44)/(N+.12)
>         California: (i-1)/N
>
>     :author: David Huard
>     """
>     # Double argsort gives the rank (1..N) of each observation;
>     # tied values receive distinct ranks.
>     i = np.argsort(np.argsort(data)) + 1.
>     nobs = len(data)
>     method = method.lower()
>     if method == 'hazen':
>         cdf = (i-0.5)/nobs
>     elif method == 'weibull':
>         cdf = i/(nobs+1.)
>     elif method == 'california':
>         cdf = (i-1.)/nobs
>     elif method == 'chegodayev':
>         cdf = (i-.3)/(nobs+.4)
>     elif method == 'cunnane':
>         cdf = (i-.4)/(nobs+.2)
>     elif method == 'gringorten':
>         cdf = (i-.44)/(nobs+.12)
>     else:
>         raise ValueError('Unknown method. Choose among Weibull, Hazen, '
>                          'Chegodayev, Cunnane, Gringorten and California.')
>     return cdf
Unfortunately it's 1d only, and I am working on multivariate, at least
Pierre has a similar 1d version in scipy.stats.mstats, and a copy of it,
so far unused, is in statsmodels.
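
For reference, the multivariate case can at least be written without an
explicit Python double loop by using NumPy broadcasting. The following is
only a sketch: the name ecdf_mv and its points argument are made up for
illustration and are not from scipy or statsmodels. It uses the plain i/N
definition (the fraction of observations that are <= the evaluation point
in every coordinate). It still does O(n*m*d) comparisons and builds an
intermediate array of that size, so it trades the loop for memory rather
than giving a better algorithm.

```python
import numpy as np

def ecdf_mv(data, points=None):
    """Multivariate empirical cdf via broadcasting (hypothetical helper).

    data   : (n, d) array of observations (a 1d array is treated as d=1)
    points : (m, d) array of evaluation points; defaults to the data itself
    Returns an (m,) array with the fraction of observations that are
    <= each evaluation point in all d coordinates.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]
    if points is None:
        points = data
    else:
        points = np.atleast_2d(np.asarray(points, dtype=float))
    # below[k, j, c] is True when observation j is <= point k in coordinate c
    below = data[None, :, :] <= points[:, None, :]
    # an observation counts only if it is below in every coordinate
    return below.all(axis=2).mean(axis=1)
```

Calling ecdf_mv(data) evaluates the function at the sample points; for
large n the (m, n, d) boolean array may not fit in memory, in which case
evaluating the points in chunks would be the obvious workaround.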