[SciPy-User] R vs Python for simple interactive data analysis

Bruce Southey bsouthey@gmail....
Mon Aug 29 19:57:03 CDT 2011


On Mon, Aug 29, 2011 at 6:19 PM, Nathaniel Smith <njs@pobox.com> wrote:
> On Mon, Aug 29, 2011 at 2:51 PM,  <josef.pktd@gmail.com> wrote:
>> As an example:   mixed effects model with REML, ...
>>
>> y = X*b + Z*g, with X fixed regressors/effects and Z random effects.
>> assume design matrices X and Z are already constructed.
>>
>> Since I don't know the statistics literature well (in contrast to
>> econometrics panel data), I started to translate a matlab version to
>> help me understand this.
>> But the results don't match up, and I haven't had access to matlab for
>> a while now.
>> And I think now literal translation of long matlab functions doesn't
>> really help, compared to writing from a good textbook with checking of
>> some crucial steps.
>
> I found the "vignettes" that Doug Bates wrote alongside the lme4
> package to be pretty good descriptions of the relevant implementation
> tricks: http://cran.r-project.org/web/packages/lme4/index.html
>
> -- Nathaniel
>
Lots of memories...

As Josef said, you need the formula to create:
1) The design matrix of the fixed effects - nothing special
2) The design matrix for the random effects - somewhat interesting
3) The variance-covariance structure of the random effects - 'lots of fun'
4) The variance-covariance structure of the residual effects - 'lots of fun'

The combination of 3) and 4) addresses a huge range of models but it
gets hard really quickly.
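
To make the four pieces concrete, here is a minimal NumPy sketch for the
simplest case I can think of: a random intercept per group. The toy data,
the variable names and the variance values are all made up for
illustration, and G and R are assumed to be scaled identity matrices,
which is exactly the case where 3) and 4) are not yet 'lots of fun'.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 12 observations in 3 groups, one covariate.
n_groups = 3
groups = np.repeat(np.arange(n_groups), 4)
x = rng.normal(size=groups.size)

# 1) Fixed-effects design matrix: intercept plus the covariate.
X = np.column_stack([np.ones_like(x), x])

# 2) Random-effects design matrix: one indicator column per group.
Z = (groups[:, None] == np.arange(n_groups)[None, :]).astype(float)

# 3) Covariance of the random effects, assumed G = sigma_g2 * I.
sigma_g2 = 2.0  # assumed value
G = sigma_g2 * np.eye(n_groups)

# 4) Covariance of the residuals, assumed R = sigma_e2 * I.
sigma_e2 = 1.0  # assumed value
R = sigma_e2 * np.eye(groups.size)

# Marginal covariance of y implied by y = X*b + Z*g + e.
V = Z @ G @ Z.T + R
```

With these assumptions V is block diagonal: observations in the same
group share covariance sigma_g2, and observations in different groups
are uncorrelated. Anything richer (correlated random effects, structured
residuals) changes only G and R, which is where it gets hard.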

And that is before the estimation methodology:
1) Maximum likelihood and restricted maximum likelihood are done via
iterative MIVQUE in the file Josef provided. Basically you are
iterating the mixed model equations, so it is fairly easy but rather slow.
2) R (Bates's code, with Lindstrom or Pinheiro) and SAS (the MIXED
procedure with REML or ML) use second-derivative methods - probably the
fastest approach.
3) ASReml uses average information REML - a neat approach but probably
rather uncommon for the vast majority of people.
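
For reference, a single solve of the mixed model equations looks roughly
like the sketch below. This is not the code from Josef's file and it is
not MIVQUE itself: the variance components are treated as known, G and R
are assumed to be scaled identities, and the data and the helper name
solve_mme are hypothetical. The iterative schemes repeat a solve like
this while updating the variance components between passes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: 5 groups of 6 observations, random intercept per group.
n_groups, n_per = 5, 6
groups = np.repeat(np.arange(n_groups), n_per)
n = groups.size
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Z = (groups[:, None] == np.arange(n_groups)[None, :]).astype(float)

# Variance components treated as known here (assumed values).
sigma_g2, sigma_e2 = 2.0, 1.0
g_true = rng.normal(scale=np.sqrt(sigma_g2), size=n_groups)
e = rng.normal(scale=np.sqrt(sigma_e2), size=n)
y = X @ np.array([1.0, 0.5]) + Z @ g_true + e


def solve_mme(X, Z, y, lam):
    """Solve the mixed model equations for G = sigma_g2*I, R = sigma_e2*I.

    lam is the variance ratio sigma_e2 / sigma_g2.
    Returns (fixed-effect estimates, predicted random effects).
    """
    p, q = X.shape[1], Z.shape[1]
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + lam * np.eye(q)],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    return sol[:p], sol[p:]


b_hat, g_hat = solve_mme(X, Z, y, sigma_e2 / sigma_g2)

# Cross-check against the direct GLS / BLUP formulas using V = Z G Z' + R.
V = sigma_g2 * (Z @ Z.T) + sigma_e2 * np.eye(n)
Vinv = np.linalg.inv(V)
b_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
g_blup = sigma_g2 * Z.T @ Vinv @ (y - X @ b_gls)
```

The point of the mixed model equations is that the system has size
p + q rather than n, so you never form or invert V; the GLS/BLUP lines
at the end are there only to check that both routes agree.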

But I don't recall Jonathan's approach with his formula code.

Bruce

