[SciPy-User] Scipy's probplot compared to R's qqplot

PHobson@Geosynte...
Wed Mar 3 13:09:37 CST 2010


Hey folks,

I've taken more of an interest in statistics and Scipy lately and decided to compare the scipy.stats.probplot() function to R's qqplot(). For a given dataset, the results are slightly different.

Here's a link to the script I wrote to do the comparison. 
http://dpaste.com/167464/

Basically, it does the following:
-Uses numpy to generate some fake, normally distributed data
-Uses both R and Scipy to compute the values needed for a quantile/probability plot
-Computes linear regressions on the quantile data with both R and Scipy
-Prints some output to compare the two
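
For anyone who can't reach the paste, the Scipy half of those steps can be sketched roughly as below. This is not the linked script itself: the seed, sample size, and distribution parameters are arbitrary choices of mine, and the R/Rpy2 half is left out.

```python
import numpy as np
from scipy import stats

# Fake, normally distributed data (seed, size, loc, and scale are arbitrary)
rng = np.random.RandomState(0)
data = rng.normal(loc=5.0, scale=2.0, size=37)

# probplot returns the (theoretical quantiles, ordered data) pairs
# plus a least-squares fit through them
(osm, osr), (slope, intercept, r) = stats.probplot(data, dist="norm")

# Independent regression on the same quantile pairs, for comparison
slope2, intercept2, r2, pvalue, stderr = stats.linregress(osm, osr)

print("probplot fit:   slope=%.6f  intercept=%.6f" % (slope, intercept))
print("linregress fit: slope=%.6f  intercept=%.6f" % (slope2, intercept2))
```

The two printed fits should agree, since both regress the ordered data on the same theoretical quantiles.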

My initial conclusions:
1) R's lm(y~x) and scipy.stats.linregress(x,y) yield the same slope and intercept of a linear model. (good)
2) R and Scipy compute the quantiles of a dataset in slightly different manners (??)
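
One guess about #2, not something I have confirmed against either code base: the two packages may be using different plotting positions for the theoretical quantiles. The sketch below assumes Scipy's probplot uses Filliben's estimate of the order statistic medians and that R's qqnorm() uses ppoints(); the helper names filliben_positions and ppoints_positions are made up for illustration, and n = 20 is arbitrary.

```python
import numpy as np
from scipy import stats

def filliben_positions(n):
    # Filliben's estimate of the uniform order statistic medians
    # (assumed here to be what scipy.stats.probplot uses)
    p = np.empty(n)
    p[-1] = 0.5 ** (1.0 / n)
    p[0] = 1.0 - p[-1]
    i = np.arange(2, n)
    p[1:-1] = (i - 0.3175) / (n + 0.365)
    return p

def ppoints_positions(n):
    # (i - a) / (n + 1 - 2a), with a = 3/8 for n <= 10 and 1/2 otherwise
    # (assumed here to match R's ppoints(), which qqnorm() relies on)
    a = 3.0 / 8.0 if n <= 10 else 0.5
    i = np.arange(1, n + 1)
    return (i - a) / (n + 1 - 2 * a)

n = 20
q_filliben = stats.norm.ppf(filliben_positions(n))
q_ppoints = stats.norm.ppf(ppoints_positions(n))

# The two conventions differ most in the tails
max_diff = np.abs(q_filliben - q_ppoints).max()
print("largest difference in theoretical quantiles: %.4f" % max_diff)
```

If that is what is going on, the ordered data would be identical in both packages and only the theoretical quantiles would shift slightly, mostly at the extremes.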

Any clue as to why the discrepancy in #2 occurs? Would you consider it a big deal? I'm using:
Python v2.6.2 (XP) and v2.6.4 (Karmic and Snow Leopard)
Scipy v0.7.1
Numpy v1.4.0
R v2.10.0
Rpy2 v2.0.8

Thanks,
-Paul H.

