[SciPy-Dev] Deprecate stats.glm?

josef.pktd@gmai... josef.pktd@gmai...
Thu Jun 3 11:59:01 CDT 2010


On Thu, Jun 3, 2010 at 12:31 PM,  <josef.pktd@gmail.com> wrote:
> On Thu, Jun 3, 2010 at 12:16 PM, Nathaniel Smith <njs@pobox.com> wrote:
>> On Thu, Jun 3, 2010 at 8:53 AM,  <josef.pktd@gmail.com> wrote:
>>> GLM as in general linear model not generalized. (It's the worst
>>> conflicting acronym in stats).
>>
>> Sure, and let's not even talk about generalized least squares
>> (unrelated to both!).
>>
>> But the general linear model is basically identical to a simple linear
>> model, both in interface and implementation. There's no reason to have
>> a separate function for it, one should just accept a matrix for the
>> "y" variable in the OLS code. But *generalized* linear models are
>> different in interface, implementation, and are almost as much of a
>> stats workhorse as standard linear models. So every book I've ever
>> seen uses the abbreviation "glm" to refer to the generalized version.
>> (Also, this is what R calls the function ;-).)
>
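
A minimal NumPy sketch of the point about accepting a matrix for "y". The data and variable names are made up for illustration and are not taken from scipy.stats. np.linalg.lstsq takes a 2-D right-hand side, so several responses can be fitted in one call, and the coefficients match what you get by fitting each column on its own:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one regressor
B_true = np.array([[1.0, -2.0, 0.5],
                   [3.0, 0.0, 1.5]])                   # one column per response
Y = X @ B_true + 0.1 * rng.normal(size=(n, 3))         # three responses at once

# One call fits all responses: lstsq accepts a 2-D right-hand side.
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Fitting each response separately gives the same coefficients.
B_separate = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
)

print(np.allclose(B_joint, B_separate))
```
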
> Coming more from the econometrics side, I never heard of "generalized"
> until two years ago, and glm was always general linear model
> (scikits.learn and many other packages use it in this definition).
>
>
>>
>> The implementation of dummy coding is kind of useful, but this is the
>> wrong place and the wrong name...
>>
>> (Also, its least squares implementation calls inv -- the textbook
>> example of bad numerics!)
>>
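
To illustrate the remark about inv, here is a small sketch. It assumes a deliberately ill-conditioned polynomial design, which is an illustrative choice and not the stats.glm code. Forming X'X roughly squares the condition number, while np.linalg.lstsq works on X directly:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)
X = np.vander(x, 6, increasing=True)  # columns 1, x, ..., x**5: nearly collinear
beta_true = np.ones(X.shape[1])
y = X @ beta_true                     # noise-free, so the exact answer is known

# Textbook normal equations with an explicit inverse (the pattern to avoid).
beta_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)

# SVD-based least squares on X itself.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)    # roughly cond_X squared
err_inv = np.max(np.abs(beta_inv - beta_true))
err_lstsq = np.max(np.abs(beta_lstsq - beta_true))
print(cond_X, cond_XtX, err_inv, err_lstsq)
```

On a design like this the inv-based error is typically larger than the lstsq error, although the exact numbers depend on the platform and the linear algebra backend.
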
>> ...Okay, you know all that anyway, the question is what to do with it.
>> If the problem were just that it needed a better implementation and
>> some new features added, then maybe we would keep it and let it be
>> improved incrementally. But the interface is just wrong, so we'll be
>> removing it sooner or later, and it might as well be sooner, rather
>> than prolong the agony.
>
> Actually, my version of stats.glm, written as a test rather than as an
> estimation model, uses least squares in the name but has a similar interface:
>
> http://bazaar.launchpad.net/~scipystats/statsmodels/trunk/annotate/head%3A/scikits/statsmodels/sandbox/regression/onewaygls.py
>
> class OneWayLS(object):
>     '''Class to test equality of regression coefficients across groups
>
>     This class performs tests whether the linear regression coefficients are
>     the same across pre-specified groups. This can be used to test for
>     structural breaks at given change points, or for ANOVA style analysis of
>     differences in the effect of explanatory variables across groups.

Actually, I don't have t-test results, because I only look at the
general case with two or more groups, and only the F-test is relevant in
that case. So its simplest case is similar to stats.f_oneway, not
stats.glm.

http://bazaar.launchpad.net/~scipystats/statsmodels/trunk/annotate/head%3A/scikits/statsmodels/sandbox/examples/ex_onewaygls.py#L99
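
As a rough sketch of this kind of test (not the OneWayLS code; the helper names rss and f_equal_coefs are hypothetical), the F statistic compares a pooled fit against separate per-group fits. With a constant-only design it reduces to the one-way ANOVA F statistic, which is what stats.f_oneway reports:

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_equal_coefs(y, X, groups):
    """F statistic for H0: the same coefficients hold in every group."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, k = X.shape
    g = len(labels)
    rss_pooled = rss(y, X)  # restricted: one coefficient vector for all groups
    rss_separate = sum(rss(y[groups == lab], X[groups == lab]) for lab in labels)
    df_num = k * (g - 1)    # number of equality restrictions
    df_den = n - k * g      # residual degrees of freedom, unrestricted fit
    f = ((rss_pooled - rss_separate) / df_num) / (rss_separate / df_den)
    return f, df_num, df_den

# Simplest case: constant only, three groups (toy data).
y = [1, 2, 3, 2, 3, 4, 5, 6, 7]
groups = [0, 0, 0, 1, 1, 1, 2, 2, 2]
X = np.ones((len(y), 1))
f, df_num, df_den = f_equal_coefs(y, X, groups)
print(f, df_num, df_den)  # F is 13 (up to rounding) with (2, 6) degrees of freedom
```

Adding more columns to X turns the same function into a test of equal slopes and intercepts across groups, under the usual assumption of a common error variance.
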

And thanks, Warren and Nathaniel, for voicing some strong opinions; it's
very useful for breaking my indifference (economic utility definition).

Josef

>
> I don't see a way to provide a "better implementation and add some new
> features" without going full scale.
>
> That's why I agree now with deprecation, since after this thread it's
> not a hidden legacy/fossil anymore.
>
> Josef
>
>>
>> -- Nathaniel
>>
>