[SciPy-Dev] Deprecate stats.glm?

josef.pktd@gmai... josef.pktd@gmai...
Thu Jun 3 11:31:25 CDT 2010

On Thu, Jun 3, 2010 at 12:16 PM, Nathaniel Smith <njs@pobox.com> wrote:
> On Thu, Jun 3, 2010 at 8:53 AM,  <josef.pktd@gmail.com> wrote:
>> GLM as in general linear model not generalized. (It's the worst
>> conflicting acronym in stats).
> Sure, and let's not even talk about generalized least squares
> (unrelated to both!).
> But the general linear model is basically identical to a simple linear
> model, both in interface and implementation. There's no reason to have
> a separate function for it, one should just accept a matrix for the
> "y" variable in the OLS code. But *generalized* linear models are
> different in interface, implementation, and are almost as much of a
> stats workhorse as standard linear models. So every book I've ever
> seen uses the abbreviation "glm" to refer to the generalized version.
> (Also, this is what R calls the function ;-).)
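
To illustrate the point above that a matrix "y" needs no separate code
path, here is a minimal sketch using np.linalg.lstsq, which already
accepts a 2-D right-hand side. The data and the names X, Y, B_joint and
B_cols are made up for illustration; this is not the stats.glm or any
existing OLS code.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative design matrix with an intercept column and two regressors.
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
# Three response variables stacked as columns ("general" linear model).
Y = rng.normal(size=(50, 3))

# One least-squares call with a matrix-valued y ...
B_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# ... gives the same coefficients as fitting each column on its own.
B_cols = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
)
```

Each column of B_joint holds the coefficients for one response, so the
multivariate case is the univariate case repeated column by column.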

Coming more from the econometrics side, I had never heard of
"generalized" until two years ago, and glm always meant the general
linear model (scikits.learn and many other packages use it in this
sense).

> The implementation of dummy coding is kind of useful, but this is the
> wrong place and the wrong name...
> (Also, its least squares implementation calls inv -- the textbook
> example of bad numerics!)
> ...Okay, you know all that anyway, the question is what to do with it.
> If the problem were just that it needed a better implementation and
> some new features added, then maybe we would keep it and let it be
> improved incrementally. But the interface is just wrong, so we'll be
> removing it sooner or later, and it might as well be sooner, rather
> than prolong the agony.
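
A small sketch of why calling inv in least squares is considered bad
numerics: forming X'X squares the condition number, so an explicit
inverse loses far more accuracy than an orthogonalization-based solver.
The polynomial design below is an assumed example chosen to be
ill-conditioned; it is not taken from stats.glm.

```python
import numpy as np

# Ill-conditioned design: monomials 1, x, ..., x^8 on [0, 1].
x = np.linspace(0.0, 1.0, 50)
X = np.vander(x, 9, increasing=True)
beta_true = np.ones(X.shape[1])
y = X @ beta_true  # noise-free, so the exact solution is known

# Textbook normal equations with an explicit inverse.
beta_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)

# Least-squares solver that works on X directly instead of X'X.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Largest coefficient error for each approach.
err_inv = np.max(np.abs(beta_inv - beta_true))
err_lstsq = np.max(np.abs(beta_lstsq - beta_true))
print(err_inv, err_lstsq)
```

On a well-conditioned design the two approaches agree closely; the gap
opens up as the columns of X become nearly collinear.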

Actually, my version of stats.glm, written as a test rather than as an
estimation model, uses "least squares" in its name but has a similar
interface:


class OneWayLS(object):
    '''Class to test equality of regression coefficients across groups

    This class tests whether the linear regression coefficients are
    the same across pre-specified groups. This can be used to test for
    structural breaks at given change points, or for ANOVA style analysis of
    differences in the effect of explanatory variables across groups.
    '''
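
As a rough sketch of the kind of test the docstring describes, assuming
a standard Chow-style F test (pooled regression versus one regression
per group): the helper name equal_coef_ftest, its signature, and the
example data are hypothetical and are not the OneWayLS implementation.

```python
import numpy as np
from scipy import stats


def equal_coef_ftest(y, X, groups):
    """F test that all regression coefficients are equal across groups.

    Hypothetical helper for illustration only.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape
    labels = np.unique(groups)

    def ssr(Xm, ym):
        # Sum of squared residuals of an OLS fit.
        beta, *_ = np.linalg.lstsq(Xm, ym, rcond=None)
        resid = ym - Xm @ beta
        return resid @ resid

    ssr_pooled = ssr(X, y)  # restricted: one coefficient vector for all groups
    ssr_sep = sum(ssr(X[groups == g], y[groups == g]) for g in labels)  # unrestricted
    df_num = k * (len(labels) - 1)
    df_den = n - k * len(labels)
    F = ((ssr_pooled - ssr_sep) / df_num) / (ssr_sep / df_den)
    return F, stats.f.sf(F, df_num, df_den), (df_num, df_den)


# With an intercept-only design the test reduces to one-way ANOVA.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 20)
y = rng.normal(size=60) + 0.5 * groups
X = np.ones((60, 1))
F, p, dof = equal_coef_ftest(y, X, groups)
F_anova = stats.f_oneway(*(y[groups == g] for g in np.unique(groups))).statistic
```

Adding further columns to X extends the same comparison from group
means to group-specific slopes, which is the structural-break use case.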

I don't see a way to provide a "better implementation and add some new
features" without going full scale.

That's why I agree now with deprecation, since after this thread it's
not a hidden legacy/fossil anymore.


> -- Nathaniel
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
