[SciPy-User] [OT] statistical test for comparing two measurements (with errors)
Wed Sep 14 11:09:28 CDT 2011
On 09/13/2011 06:51 PM, firstname.lastname@example.org wrote:
> On Tue, Sep 13, 2011 at 6:54 PM, David Baddeley
> <email@example.com> wrote:
>> Hi all, seeing as there are a few stats gurus on the list, I thought someone might know the answer to this question:
>> I've got two distributions and want to compare each of the moments of the distributions and determine the individual probability of each of them being equal. What I've done so far is to calculate the moments, and (using Monte Carlo sub-sampling) estimate an error for each calculation.
>> This essentially gives a value and a 'measurement error' for each moment and distribution, and I'm looking for a test which will take these pairs and determine if they're likely to be equal. One option I've considered is to use/abuse the t-test as it compares two distributions with given means and std. deviations (analogous to the value and error scenario I have). What I'm struggling with is how to choose the degrees of freedom - I've contemplated using the number of Monte Carlo iterates, but this doesn't really seem right because I'm not convinced that they will be truly independent measures.
> You are comparing raw or standardized moments?
> If you get the bootstrap standard errors of the test statistic
> (comparison of moments) then I would just use the normal instead of
> the t distribution, degrees of freedom equal to infinity.
> Alternatively, you could just use the simple bootstrap, use quantiles
> of the bootstrap distribution, or calculate a p-value based on the
> Monte Carlo distribution.
> Permutations would be another way of generating a reference
> distribution under the null of equal distributions.
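A minimal sketch of the permutation idea suggested above, assuming 1D samples and using skewness as the moment of interest. The helper name `permutation_pvalue`, the sample sizes and the number of permutations are all illustrative choices, not anything from the original data:

```python
import numpy as np
from scipy import stats

def permutation_pvalue(x, y, statistic, n_perm=2000, seed=0):
    """Two-sided permutation p-value for statistic(x) - statistic(y).

    The null hypothesis is that x and y come from the same distribution,
    so group labels can be shuffled freely.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    y = np.asarray(y)
    observed = statistic(x) - statistic(y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        diff = statistic(shuffled[:x.size]) - statistic(shuffled[x.size:])
        if abs(diff) >= abs(observed):
            count += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)

# Synthetic example data (an assumption, for illustration only).
rng = np.random.default_rng(1)
a = rng.normal(size=300)
b = rng.exponential(size=300)  # strongly skewed, unlike a

p_skew = permutation_pvalue(a, b, stats.skew)
```

Any other moment can be passed as `statistic`. Note that a small p-value here rejects "same distribution", not specifically "same skewness".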
>> The other option I've thought of is the reciprocal of the Monte Carlo selection probability - this gives results which 'feel' right, but I'm having a hard time finding a solid justification for it.
> I'm not quite sure what you mean here. Isn't the selection probability
> just 1/number of observations?
>> If anyone could suggest either an alternative test, or a suitable way of estimating degrees of freedom I'd be very grateful.
> Standard tests exist for the mean and variance. I haven't seen much for
> higher moments (or skew and kurtosis), so either resampling (Monte Carlo,
> bootstrap, permutation) or relying on the Central Limit Theorem and
> using the normal distribution might be the only available approach.
>> To give a little more context, the underlying distributions from which I am calculating moments are 2D clouds of points and what I'm eventually aiming at is a way of quantifying shape similarity (and possibly also determining which moments give the most robust shape discrimination).
> How do you handle the bivariate features, correlation, dependence?
> Are you working on the original data or on some transformation, e.g.
> is shape similarity rotation invariant (whatever that means)? I'm
> mainly curious, it sounds like an interesting problem.
> Just to compare distributions, there would also be goodness of fit
> tests available, but most of them wouldn't help in identifying what
> the discriminating characteristics are.
>> many thanks,
You probably need to look at Kolmogorov–Smirnov and related tests (see
the 'See also' links from Wikipedia) like Anderson–Darling.
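For what it's worth, a two-sample Kolmogorov–Smirnov test is available in SciPy as `scipy.stats.ks_2samp` (and a k-sample Anderson–Darling test as `scipy.stats.anderson_ksamp`). A rough sketch on synthetic 1D data follows; for 2D point clouds this would have to be applied to some 1D projection or marginal, which is an assumption on my part:

```python
import numpy as np
from scipy import stats

# Synthetic 1D samples (illustrative only): same spread, shifted location.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=500)
y = rng.normal(loc=0.5, scale=1.0, size=500)

# Null hypothesis: both samples are drawn from the same distribution.
result = stats.ks_2samp(x, y)
ks_stat, ks_p = result.statistic, result.pvalue
```

As noted earlier in the thread, a small p-value says the distributions differ but not which characteristic (location, spread, skew) is responsible.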
If there is sufficient data and the distributions are not too asymmetric,
then use the standard Normal rather than the t-distribution, which avoids
having to choose the degrees of freedom.
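A minimal sketch of that Normal-based comparison for one moment, assuming the two estimates are independent and that the Monte Carlo 'errors' are standard errors of the estimates. The function name `z_test_equal` and the numbers are made up for illustration:

```python
import math
from scipy import stats

def z_test_equal(value_a, err_a, value_b, err_b):
    """Two-sided z-test for equality of two independent estimates.

    err_a and err_b are assumed to be standard errors (e.g. from
    bootstrap or Monte Carlo sub-sampling).
    """
    # Standard error of the difference: errors add in quadrature.
    z = (value_a - value_b) / math.hypot(err_a, err_b)
    # Two-sided p-value from the standard normal survival function.
    p = 2.0 * stats.norm.sf(abs(z))
    return z, p

# Hypothetical moment estimates and their errors.
z, p = z_test_equal(1.20, 0.10, 1.05, 0.12)
```

With several moments tested at once, the individual p-values would need some multiple-comparison correction before being interpreted together.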
You will probably find these informative as R's 'fitdistrplus' package
seems to do what you want (no experience with these):
'FITTING DISTRIBUTIONS WITH R'
'fitdistrplus: Help to fit of a parametric distribution to non-censored
or censored data'