[Numpy-discussion] 2d binning and linear regression
Mon Jun 21 06:25:11 CDT 2010
On Sun, Jun 20, 2010 at 10:57 PM, Tom Durrant <firstname.lastname@example.org> wrote:
>> are you doing something like np.polyfit(model, obs, 1) ?
>> If you are using polyfit with deg=1, i.e. fitting a straight line,
>> then this could be also calculated using the weights in histogram2d.
>> histogram2d (histogramdd) uses np.digitize and np.bincount, so I'm
>> surprised if the histogram2d version is much faster. If a quick
>> reading of histogramdd is correct, the main improvement would be to
>> get the labels "xy" out of it, so it can be used repeatedly with
> Thanks Josef,
> From my limited understanding, you are right, the histogram is much faster due to
> the fact that it doesn't have to keep reading in the array over and over....
> I am using np.polyfit(model, obs, 1). I couldn't work out a way to do these
> regressions using histogram2d and weights, but you think it can be done? This
> would be great!
the basic idea is in the thread "polyfit on multiple data points" on the
numpy-discussion mailing list, April 2009
In this case, the calculations have to be done by group:

# subtract mean (this needs to be replaced by group demeaning)
modeldm = model - model.mean()
obsdm = obs - obs.mean()

# per-cell sums of squares and cross products
xx, xedges, yedges = np.histogram2d(lat, lon, weights=modeldm*modeldm,
xy, xedges, yedges = np.histogram2d(lat, lon, weights=modeldm*obsdm,

slopes = xy/xx # slopes by group

# expand slopes to length of original array
predicted = modeldm * slopes_expanded # fitted (demeaned) obs; residual is obsdm - predicted
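
A minimal sketch of the steps above, not a tested drop-in. It replaces the global demeaning with group demeaning by accumulating per-cell sums with histogram2d and using sum((x - xbar)*(y - ybar)) = sum(x*y) - sum(x)*sum(y)/n. The helper name binned_slopes and the bins argument (a pair of edge arrays) are assumptions for illustration, since the bins used in the thread are not shown here.

```python
import numpy as np

def binned_slopes(lat, lon, model, obs, bins):
    """Per-cell least-squares slope and intercept of obs on model.

    Hypothetical helper; `bins` is anything np.histogram2d accepts,
    e.g. a (latedges, lonedges) pair.
    """
    def cell_sum(w=None):
        # With weights, histogram2d returns the sum of w in each cell;
        # without weights it returns the number of points per cell.
        return np.histogram2d(lat, lon, bins=bins, weights=w)[0]

    n = cell_sum()
    sx = cell_sum(model)
    sy = cell_sum(obs)
    sxx = cell_sum(model * model)
    sxy = cell_sum(model * obs)

    # Empty cells (and cells with no spread in model) give 0/0 -> nan.
    with np.errstate(invalid="ignore", divide="ignore"):
        xx = sxx - sx * sx / n   # sum of (x - cell mean of x)**2
        xy = sxy - sx * sy / n   # sum of (x - xbar) * (y - ybar)
        slopes = xy / xx
        intercepts = (sy - slopes * sx) / n
    return slopes, intercepts
```

The raw-sum formula can lose precision when the means are large compared with the spread; subtracting the global means first, as in the lines above, reduces that and does not change the slopes.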
the main point is to get the group functions, for demeaning, ... for
the 2d labels (and get the labels out of histogramdd)
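
One way the label idea could look, as a rough sketch under assumptions: the labels are built directly with np.digitize rather than taken out of histogramdd, and the function names (cell_labels, group_demean, slopes_by_label) are made up for this example. Once each point has a flat cell label, np.bincount gives the group sums, and indexing with the labels expands per-cell values back to the length of the original array.

```python
import numpy as np

def cell_labels(lat, lon, latedges, lonedges):
    """Flat cell index per point, -1 for points outside the grid."""
    i = np.digitize(lat, latedges) - 1
    j = np.digitize(lon, lonedges) - 1
    nlat, nlon = len(latedges) - 1, len(lonedges) - 1
    # Note: unlike histogram2d, a value equal to the last edge counts
    # as outside here.
    inside = (i >= 0) & (i < nlat) & (j >= 0) & (j < nlon)
    labels = np.where(inside, i * nlon + j, -1)
    return labels, nlat * nlon

def group_demean(x, labels, ncells):
    """Subtract from each point the mean of its own cell (labels >= 0)."""
    counts = np.bincount(labels, minlength=ncells)
    sums = np.bincount(labels, weights=x, minlength=ncells)
    means = sums / np.maximum(counts, 1)
    return x - means[labels]

def slopes_by_label(model, obs, labels, ncells):
    """Per-cell slope of obs on model, plus the group-demeaned arrays."""
    xdm = group_demean(model, labels, ncells)
    ydm = group_demean(obs, labels, ncells)
    xx = np.bincount(labels, weights=xdm * xdm, minlength=ncells)
    xy = np.bincount(labels, weights=xdm * ydm, minlength=ncells)
    with np.errstate(invalid="ignore", divide="ignore"):
        slopes = xy / xx   # nan for empty cells
    return slopes, xdm, ydm
```

Points with label -1 have to be masked out before calling slopes_by_label. After that, slopes[labels] is the slope expanded to the original length, ydm - slopes[labels] * xdm is the residual, and slopes.reshape(nlat, nlon) puts the result back on the lat/lon grid.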
I'm out of time (off to the airport soon), but I can look into it next weekend.