[SciPy-User] Scipy's probplot compared to R's qqplot
Robert Kern
robert.kern@gmail....
Wed Mar 3 13:38:21 CST 2010
On Wed, Mar 3, 2010 at 13:09, <PHobson@geosyntec.com> wrote:
> Hey folks,
>
> I've taken more of an interest in statistics and Scipy lately and decided to compare the scipy.stats.probplot() function to R's qqplot(). For a given dataset, the results are slightly different.
>
> Here's a link to the script I wrote to do the comparison.
> http://dpaste.com/167464/
>
> Basically, it does the following:
> -Uses numpy to generate some fake, normally distributed data
> -Uses both R and Scipy to compute the values needed for a quantile/probability plot
> -Computes linear regressions on the quantile data with both R and Scipy
> -Prints some output to compare the two
>
> My initial conclusions:
> 1) R's lm(y~x) and scipy.stats.linregress(x,y) yield the same slope and intercept of a linear model. (good)
> 2) R and Scipy compute the quantiles of a dataset in slightly different manners (??)
>
> Any clue as to why the discrepancy in #2 occurs?
There are several, slightly different but mostly reasonable ways of
computing quantiles.
> Would you consider it a big deal?
Probably not, but I'm happy to entertain arguments to the contrary if
you would care to explain how R is computing the quantiles.
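
To make the "several ways" concrete, here is a minimal sketch of two
plotting-position conventions. It assumes the R side went through
qqnorm(), which takes its positions from ppoints(); the script behind the
dpaste link is not reproduced here, so that is a guess. The helper names
filliben_positions and r_ppoints are made up for illustration. The first
follows the Filliben estimate that the scipy.stats.probplot docstring
describes, the second follows the formula in R's ppoints help page.

```python
import numpy as np
from scipy import stats


def filliben_positions(n):
    """Filliben's estimate, as documented for scipy.stats.probplot."""
    p = (np.arange(1, n + 1) - 0.3175) / (n + 0.365)
    p[-1] = 0.5 ** (1.0 / n)   # the largest order statistic is special-cased
    p[0] = 1.0 - p[-1]         # and the smallest mirrors it
    return p


def r_ppoints(n):
    """R's ppoints(n): (i - a) / (n + 1 - 2a), a = 3/8 if n <= 10 else 1/2."""
    a = 3.0 / 8.0 if n <= 10 else 0.5
    return (np.arange(1, n + 1) - a) / (n + 1 - 2 * a)


n = 20
# Theoretical normal quantiles under each convention.
scipy_q = stats.norm.ppf(filliben_positions(n))
r_q = stats.norm.ppf(r_ppoints(n))

# Largest disagreement between the two sets of theoretical quantiles;
# it shows up in the tails, where the positions differ the most.
max_diff = np.max(np.abs(scipy_q - r_q))
print(max_diff)
```

Both sets of positions are symmetric about 0.5 and differ mainly at the
extremes, so a line fitted through either set of quantiles should come out
close to the other, with the gap shrinking as n grows.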
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco