[SciPy-User] [OT] statistical test for comparing two measurements (with errors)

josef.pktd@gmai...
Tue Sep 13 18:51:27 CDT 2011


On Tue, Sep 13, 2011 at 6:54 PM, David Baddeley
<david_baddeley@yahoo.com.au> wrote:
> Hi all, seeing as there are a few stats gurus on the list, I thought someone might know the answer to this question:
>
> I've got two distributions and want to compare each of the moments of the distributions and determine the individual probability of each of them being equal. What I've done so far is to calculate the moments, and (using Monte Carlo sub-sampling) estimate an error for each calculation.
>
> This essentially gives a value and a 'measurement error' for each moment and distribution, and I'm looking for a test which will take these pairs and determine if they're likely to be equal. One option I've considered is to use/abuse the t-test, as it compares two distributions with given means and std. deviations (analogous to the value and error scenario I have). What I'm struggling with is how to choose the degrees of freedom - I've contemplated using the number of Monte Carlo iterates, but this doesn't really seem right because I'm not convinced that they will be truly independent measures.

Are you comparing raw or standardized moments?

If you get the bootstrap standard errors of the test statistic (the
comparison of moments), then I would just use the normal instead of
the t distribution, i.e. degrees of freedom equal to infinity.
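
Roughly something like this (untested sketch; x and y are placeholder
1d samples and skew is just the example statistic):

import numpy as np
from scipy import stats

def boot_se(data, statistic, n_boot=1000):
    # bootstrap standard error of `statistic`
    n = len(data)
    reps = np.array([statistic(data[np.random.randint(0, n, n)])
                     for _ in range(n_boot)])
    return reps.std(ddof=1)

x = np.random.standard_normal(200)     # placeholder data
y = np.random.standard_t(5, size=300)

diff = stats.skew(x) - stats.skew(y)
se = np.sqrt(boot_se(x, stats.skew)**2 + boot_se(y, stats.skew)**2)
z = diff / se
pval = 2 * stats.norm.sf(abs(z))       # normal reference, df = infinity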

Alternatively, you could just use the simple bootstrap, use quantiles
of the bootstrap distribution, or calculate a p-value based on the
Monte Carlo distribution.
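
Continuing the sketch above, a percentile-bootstrap version (again
untested, same placeholder x, y and stats.skew):

def boot_diff(x, y, statistic, n_boot=2000):
    # bootstrap distribution of the difference in the statistic
    nx, ny = len(x), len(y)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        xb = x[np.random.randint(0, nx, nx)]
        yb = y[np.random.randint(0, ny, ny)]
        diffs[i] = statistic(xb) - statistic(yb)
    return diffs

diffs = boot_diff(x, y, stats.skew)
lo, hi = np.percentile(diffs, [2.5, 97.5])
# reject equality at the 5% level if the interval [lo, hi] does not
# cover zero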

Permutations would be another way of generating a reference
distribution under the null of equal distributions.
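
Sketch, with the same placeholder x and y as above:

def perm_pvalue(x, y, statistic, n_perm=2000):
    # permutation p-value under the null that both samples come
    # from the same distribution
    observed = abs(statistic(x) - statistic(y))
    pooled = np.concatenate([x, y])
    nx = len(x)
    count = 0
    for _ in range(n_perm):
        np.random.shuffle(pooled)
        d = abs(statistic(pooled[:nx]) - statistic(pooled[nx:]))
        if d >= observed:
            count += 1
    return (count + 1.0) / (n_perm + 1)

pval = perm_pvalue(x, y, stats.skew)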

> The other option I've thought of is the reciprocal of the Monte Carlo selection probability - this gives results which 'feel'
> right, but I'm having a hard time finding a solid justification for it.

I'm not quite sure what you mean here. Isn't the selection probability
just 1/number of observations?

>
> If anyone could suggest either an alternative test, or a suitable way of estimating degrees of freedom I'd be very grateful.

Standard tests exist for the mean and variance, but I haven't seen
much for higher moments (or skew and kurtosis). Either resampling
(Monte Carlo, bootstrap, permutation) or relying on the Law of Large
Numbers and using the normal distribution might be the only available
approaches.
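
For the first two moments, scipy has the standard tests, e.g. (with
the same placeholder x, y as in the sketches above):

t_stat, p_mean = stats.ttest_ind(x, y)   # equality of means
w_stat, p_var = stats.levene(x, y)       # Levene's test for equal variances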

>
> To give a little more context, the underlying distributions from which I am calculating moments are 2D clouds of points and what I'm eventually aiming at is a way of quantifying shape similarity (and possibly also determining which moments give the most robust shape discrimination).

How do you handle the bivariate features, correlation, dependence?
Are you working on the original data or on some transformation, e.g.
is shape similarity rotation invariant (whatever that means)?  I'm
mainly curious; it sounds like an interesting problem.
Just to compare distributions, there would also be goodness of fit
tests available, but most of them wouldn't help in identifying what
the discriminating characteristics are.
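
For example, a two-sample Kolmogorov-Smirnov test on the (marginal)
distributions; again just a sketch with the same 1d placeholder x, y:

ks_stat, ks_p = stats.ks_2samp(x, y)
# compares the whole distributions, but doesn't say which moment differs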

Josef

>
> many thanks,
> David
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

