# [Numpy-discussion] Crosstabulation

Friedrich Romstedt friedrichromstedt@gmail....
Mon Jul 19 15:14:38 CDT 2010

```2010/7/19 sandric ionut <sandricionut@yahoo.com>:
> For land-use a class would be for example forest, other would be orchard
> etc. For Slope gradient I would have values which <3 and between 3 and 7
> etc. So, I will have 2 raster data with, let's say, 3 classes each: forest,
> orchards and built-up area and for slope gradient: 0-3, 3-15, 15-35. The
> cross-tabulation analysis should give me a table like:
>
>             forest orchards built-up
> 0-3         10         20           15
> 3-15       5           10           20
> 15-35     5           15           15
>
> where the numbers represents all the common cells, for example: 10 cells
> with forest correspond to 10 cells with 0-3 slope gradient interval and so
> on
> (by cells I mean the pixel from a raster data)

Okay everything is clear now.  I would suggestest looping over the
cells of the table.  E.g. when:

landuse_catmap
slope_map

are the respective maps, you can categorise the slope_map with

slope_catmap = 0 * (0 <= slope_map) * (slope_map < 3) + 1 * (3 <=
slope_map) * (slope_map < 15) + 2 * (15 <= slope_map)

to get categories 0, 1, and 2, where the zero case is included here
only for illustration, since it's just zero it can be omitted.

Then you are maybe supposed to do:

table = numpy.zeros((N_slope_cats, N_landuse_cats))

and finally the looping over its elements, where the looping over the
map cells is done entirely by numpy:

for slope_cat in xrange(0, N_slope_cats):
for landuse_cat in xrange(0, N_landuse_cats):
table[slope_cat, landuse_cat] = \
((slope_catmap == slope_cat) * \
(landuse_catmap == landuse_cat)).sum()

I believe there is some slightly faster but more obscure way doing
this table creation in one single large step by putting the maps into
a hyperdimensianal array but I also believe this is beyond our scope
here if this is already fast enough.

Friedrich
```