[Numpy-discussion] creating zonal statistics from two arrays

Gregory, Matthew matt.gregory@oregonstate....
Wed Dec 8 11:12:44 CST 2010


Hi all,

Likely a very newbie type of question.  I'm using numpy with GDAL to calculate zonal statistics on images.  The basic approach is that I have a zone raster and a value raster which are aligned spatially and I am storing each zone's corresponding values in a dictionary, then calculating the statistics on that population.  (I'm well aware that this approach may have memory issues with large rasters ...)

GDAL ReadAsArray gives you a chunk of raster data as a numpy array.  Currently I'm iterating over rows and columns of that chunk, but I'm guessing there's a better (and more numpy-like) way.

zone_stats = {}
zone_block = zone_band.ReadAsArray(x_off, y_off, x_size, y_size)
value_block = value_band.ReadAsArray(x_off, y_off, x_size, y_size)
for row in xrange(y_size):
    for col in xrange(x_size):
        zone = zone_block[row][col]
        value = value_block[row][col]
        try:
            zone_stats[zone].append(value)
        except KeyError:
            zone_stats[zone] = [value]

# Then calculate stats per zone
...

Thanks for all suggestions on how to make this better, especially if the initial approach I'm taking is flawed.

matt





More information about the NumPy-Discussion mailing list