[SciPy-user] ndimage starting points

David Warde-Farley dwf@cs.toronto....
Thu Oct 23 02:20:30 CDT 2008

Hi all,

Lately I've been looking at ndimage for replacing some of the  
functionality in the Matlab Image Processing Toolbox and elsewhere,  
but am running into some documentation holes. The parts of ndimage  
that I can figure out how to use work brilliantly and as advertised  
(the filters module for example), but a lot of the functions in some  
submodules don't say much about what form of input they take.

I'm hoping that someone more familiar with the codebase can point me  
in the right direction, and I'll be happy to clean up whatever comes  
of this thread (and anything else I discover) so that we can put it  
into docstrings or the cookbook. I'm mainly focusing on  
ndimage.measurements for now.

So, here's the list. Any help or clarification is appreciated.

- I've figured out, without much help from the docstrings, that the  
labels= argument to many of the functions is an integer array (that  
can be) produced using the very handy label() function. This probably  
deserves a mention in the module docstring (which I am happy to write).

- label() takes an optional "structure" argument - what exactly is  
this, what form does it take, how does one create it, and in what  
circumstances should it be used? Also, is it intended to be used with  
thresholded images?

- The same question about 'structure' goes for watershed_ift, as well  
as what form the 'markers' argument takes (I'm assuming an array with  
markers.shape == input.shape). dtype is... anything numeric I guess?  
It says negatives are treated differently than positives, but nothing  

- center_of_mass() - this may be a dumb question, but does this produce an  
"index" (not necessarily integer-valued) in ndim(input) space, where a  
higher value at position (i_1, i_2, ... i_n) exerts more "pull" on  
the center of mass than a lower value at the same position would?

- Is there a reason that find_objects() takes a "max_label" argument  
whereas every other function takes a scalar or sequence "index"  
argument? It seems inconsistent, though there may be some good  
algorithmic reason for it.

- On a similar note, find_objects computes a "bounding box" of some  
sort when generating slices, I'm guessing? (or are slices far more  
general than I had thought?)

- histogram()'s documentation seems incomplete to me. Just to be  
clear, does it always produce a one-dimensional object, regardless of  
the dimensionality of the input?

- Some things like variance() don't immediately seem to add anything  
to the standard numpy functions; I am assuming that the ability to  
mask by label is their key advantage. Can someone confirm or correct  
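
Several of the questions above can be probed empirically. The following is only a minimal sketch, not an authoritative description of ndimage: the small `image` array and all variable names are made up for illustration, and it assumes a SciPy version that provides generate_binary_structure() for building a "structure" array. It checks how "structure" changes what label() considers connected, what center_of_mass() and find_objects() return for a labelled region, whether histogram() output is one-dimensional, and whether variance() with labels matches numpy.var() on the masked values.

```python
import numpy as np
from scipy import ndimage

# Hypothetical test image: a 2x2 blob, one pixel touching it only
# diagonally, and one isolated pixel. Values act as "mass".
image = np.array([[1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0],
                  [0, 0, 3, 0, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 5]], dtype=float)

# label() treats every non-zero element as foreground, so a
# thresholded (boolean) array is a natural input.
mask = image > 0

# Default structure in 2-D: only edge-sharing neighbours are connected.
labels4, n4 = ndimage.label(mask)

# A full 3x3 structure also connects diagonal neighbours.
structure8 = ndimage.generate_binary_structure(2, 2)
labels8, n8 = ndimage.label(mask, structure=structure8)
print(n4, n8)  # 3 regions vs. 2 regions

# Centre of mass of region 1, weighted by the values in `image`;
# the coordinates are floats, not integer indices.
com = ndimage.center_of_mass(image, labels8, index=1)

# One tuple of slices per label: the bounding box of each region.
boxes = ndimage.find_objects(labels8)
region1 = image[boxes[0]]

# histogram() over a 2-D input with 6 bins spanning [0, 6].
hist = ndimage.histogram(image, 0, 6, 6)

# Per-label variance compared with numpy.var() on the masked values.
var_by_label = ndimage.variance(image, labels8, index=[1, 2])
var_numpy = [np.var(image[labels8 == lab]) for lab in (1, 2)]
```

In this sketch the pixel with value 3 pulls the centre of mass of region 1 toward itself, and `image[boxes[0]]` is the rectangular sub-array enclosing that region, which is consistent with the "bounding box" reading of find_objects().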

That's about all I've got for now.

Thanks in advance,

