[Numpy-discussion] image processing using numpy-scipy?

Zachary Pincus zachary.pincus@yale....
Fri Feb 27 12:11:43 CST 2009


>> This is a slightly weird problem. I have a black and white image
>> (black background).
>> The entire image is filled with noisy white patterns of different sizes
>> and shapes. I need to fill the white patches if their area is more than
>> a given one. Logically, it should be possible to use a quickfill
>> algorithm for every pixel, and if the filled area generated by that
>> pixel is more than the given area (in pixels), then paint that patch
>> (filled area) with black.
>> I have read the docs of PIL but there is no function for this. Can I
>> use numpy-scipy for this?
> Sure, there are a couple of options.  First, look at scipy.ndimage if
> there is anything there you can use (i.e. maybe binary dilation is
> sufficient).
> Otherwise, I've got some connected component code at:
> http://mentat.za.net/source/connected_components.tar.bz2
> (The repository is at http://mentat.za.net/hg/ccomp if you prefer)

I think that scipy.ndimage.label also provides connected-component  
labeling, with arbitrary connectivity. But this problem could also be  
solvable via dilation / erosion, if the area constraint is loose --  
e.g. if you want to quash all blobs smaller than, say, 5x5 pixels, you  
could just erode for 5 iterations, and then dilate the eroded image  
back for '-1' iterations (which causes the ndimage algorithm to  
iterate until no pixels change), using a mask of the original image  
(so that no pixels outside of the original blobs are turned on). This  
basically means that any object that survives the original erosion  
will be dilated back to its initial size. (Similar tricks can also be  
used to find / kill objects touching the edges -- use a non-zero  
constant out-of-bounds value and dilate an all-zeros array for -1  
iterations, using the original array as the mask. Then only objects  
touching the edge will get filled in...)
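
As a rough illustration of the two tricks described above, here is a minimal sketch. The function names and the default of 5 iterations are made up for this example, and it assumes ndimage's default (4-connected in 2D) structuring element; treat it as one possible reading of the idea rather than a tuned implementation.

```python
import numpy as np
from scipy import ndimage

def quash_small_blobs(binary_image, iterations=5):
    """Hypothetical helper: drop blobs that do not survive `iterations` erosions."""
    binary_image = np.asarray(binary_image, dtype=bool)
    # Erode: blobs too small to survive this many erosions vanish entirely.
    seeds = ndimage.binary_erosion(binary_image, iterations=iterations)
    # Dilate the survivors until no pixels change (iterations=-1), using the
    # original image as the mask so nothing grows beyond the original blobs.
    return ndimage.binary_dilation(seeds, iterations=-1, mask=binary_image)

def objects_touching_edge(binary_image):
    """Hypothetical helper: keep only the blobs connected to the image border."""
    binary_image = np.asarray(binary_image, dtype=bool)
    empty = np.zeros(binary_image.shape, dtype=bool)
    # A non-zero out-of-bounds value seeds the dilation from outside the image;
    # the mask restricts growth to pixels that are set in the original.
    return ndimage.binary_dilation(empty, iterations=-1, mask=binary_image,
                                   border_value=1)
```

To kill the edge-touching objects rather than find them, one could combine the result with the input, e.g. `binary_image & ~objects_touching_edge(binary_image)`.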

Alternately, if you need a very stringent area threshold -- e.g.  
remove all objects with five or fewer total pixels, regardless of  
their configuration -- then the connected-components approach is  
required. Below is a stab at it, though note that there's a slow step  
that I can't right now figure out how to avoid, short of coding just  
that in C or with cython or something...


import numpy
import scipy.ndimage as ndimage

_4_connected = numpy.array([[0, 1, 0],
                            [1, 1, 1],
                            [0, 1, 0]], dtype=bool)

def kill_small(binary_image, min_size, structure=_4_connected):
    label_image, num_objects = ndimage.label(binary_image, structure)
    # Label 0 is the background...
    object_labels = numpy.arange(1, num_objects+1)
    # Get the area of each labeled object by summing up the pixel values in
    # the binary image. (Compare the image against zero to ensure that it
    # consists of 1s and 0s, so that the sum equals the area in pixels.)
    areas = numpy.array(ndimage.sum(binary_image != 0, label_image,
                                    object_labels))
    big_objects = object_labels[areas >= min_size]
    # This part will be pretty slow! But I can't think right now how to
    # speed it up.
    # (If there are more big objects than small objects, it would be faster
    # to reverse this process and xor out the small objects from the binary
    # image, rather than the below, which or's up a new image of just the
    # large objects.)
    big_object_image = numpy.zeros(binary_image.shape, dtype=bool)
    for bo in big_objects:
        big_object_image |= label_image == bo
    return big_object_image
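
One possible way around the slow loop, sketched here under the assumption that the labels are the consecutive integers returned by ndimage.label: count the pixels per label with numpy.bincount, build a boolean lookup table with one entry per label, and index that table with the label image. The name kill_small_fast is made up for this example, and it has not been benchmarked against the loop above.

```python
import numpy as np
from scipy import ndimage

def kill_small_fast(binary_image, min_size, structure=None):
    """Hypothetical variant of kill_small that avoids the per-object loop."""
    binary_image = np.asarray(binary_image) != 0
    # structure=None lets ndimage.label use its default connectivity
    # (4-connected for 2D images).
    label_image, num_objects = ndimage.label(binary_image, structure)
    # areas[k] is the number of pixels carrying label k; index 0 is background.
    areas = np.bincount(label_image.ravel(), minlength=num_objects + 1)
    # Lookup table: True for labels whose object is large enough to keep.
    keep = areas >= min_size
    keep[0] = False  # never keep the background
    # Fancy indexing maps every pixel's label to its keep/discard flag.
    return keep[label_image]
```

The lookup-table indexing touches each pixel once, regardless of how many objects there are, which is where the loop version spends its time.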
