[Numpy-discussion] image processing using numpy-scipy?
Fri Feb 27 12:11:43 CST 2009
>> This is a slightly weird problem. I have a black and white image.
>> The entire image is filled with noisy white patterns of different
>> sizes and shapes. I need to fill the white patches if their area is
>> more than a given value. Logically, it is possible to use a quickfill
>> algorithm for every pixel, and if the filled area generated by that
>> pixel is more than the given area (in pixels), then paint that patch
>> (filled area) with black.
>> I have read the PIL docs but there is no function for this. Can I
>> use numpy-scipy for this?
> Sure, there are a couple of options. First, look at scipy.ndimage if
> there is anything there you can use (i.e. maybe binary dilation is
> Otherwise, I've got some connected component code at:
> (The repository is at http://mentat.za.net/hg/ccomp if you prefer)
I think that scipy.ndimage.label also provides connected-component
labeling, with arbitrary connectivity. But this problem could also be
solvable via dilation / erosion, if the area constraint is loose --
e.g. if you want to quash all blobs smaller than, say, 5x5 pixels, you
could just erode for 5 iterations, and then dilate the eroded image
back for '-1' iterations (which causes the ndimage algorithm to
iterate until no pixels change), using a mask of the original image
(so that no pixels outside of the original blobs are turned on). This
basically means that any object that survives the original erosion
will be dilated back to its initial size. (Similar tricks can also be
used to find / kill objects touching the edges -- use a non-zero
constant out-of-bounds value and dilate an all-zeros array for -1
iterations, using the original array as the mask. Then only objects
touching the edge will get filled in...)
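
A minimal sketch of the two dilation tricks described above, assuming a 2-D binary image and ndimage's default (4-connected) structuring element. The function names remove_small_blobs and objects_touching_border are made up for illustration; only binary_erosion and binary_dilation come from scipy.ndimage.

```python
import numpy as np
from scipy import ndimage


def remove_small_blobs(binary_image, erosion_iterations):
    """Drop blobs that vanish under erosion; restore the survivors."""
    binary_image = np.asarray(binary_image, dtype=bool)
    # Blobs that are too small disappear entirely during the erosion.
    seeds = ndimage.binary_erosion(binary_image, iterations=erosion_iterations)
    # iterations=-1 repeats the dilation until no pixel changes; the mask
    # keeps the regrowth inside the original blobs.
    return ndimage.binary_dilation(seeds, iterations=-1, mask=binary_image)


def objects_touching_border(binary_image):
    """Return only the blobs connected to the edge of the image."""
    binary_image = np.asarray(binary_image, dtype=bool)
    empty = np.zeros(binary_image.shape, dtype=bool)
    # border_value=1 treats everything outside the array as "on", so the
    # dilation creeps inward from the edges, but only through the mask.
    return ndimage.binary_dilation(empty, iterations=-1,
                                   mask=binary_image, border_value=1)
```

To remove the edge-touching objects rather than select them, the result can be combined with the input, e.g. binary_image & ~objects_touching_border(binary_image). Note that the erosion count only loosely corresponds to a blob size, and the exact cutoff depends on the blob shape and the structuring element.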
Alternately, if you need a very stringent area threshold -- e.g.
remove all objects with five or fewer total pixels, regardless of
their configuration -- then the connected-components approach is
required. Below is a stab at it, though note that there's a slow step
that I can't right now figure out how to avoid, short of coding just
that in C or with cython or something...
import numpy
import scipy.ndimage as ndimage

_4_connected = numpy.array([[0, 1, 0],
                            [1, 1, 1],
                            [0, 1, 0]], dtype=bool)

def kill_small(binary_image, min_size, structure=_4_connected):
    label_image, num_objects = ndimage.label(binary_image, structure)
    # Label 0 is the background...
    object_labels = numpy.arange(1, num_objects+1)
    # Get the area of each labeled object by summing up the pixel values in
    # the binary image. (Divide the binary image by its max val to ensure
    # that the image consists of 1s and 0s, so that the sum equals the area
    # in pixels.)
    areas = numpy.array(ndimage.sum(binary_image / binary_image.max(),
                                    label_image, object_labels))
    big_objects = object_labels[areas >= min_size]
    # This part will be pretty slow! But I can't think right now how to
    # speed it up.
    # (If there are more big objects than small objects, it would be faster
    # to reverse this process and xor out the small objects from the binary
    # image, rather than the below, which or's up a new image of just the
    # big objects.)
    big_object_image = numpy.zeros(binary_image.shape, dtype=bool)
    for bo in big_objects:
        big_object_image |= label_image == bo
    return big_object_image
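
One possible way around the slow loop over big_objects, offered as a sketch rather than as part of the code above: count the pixels per label with numpy.bincount, build a boolean lookup table indexed by label, and index that table with the label image. The name kill_small_vectorized is hypothetical, and structure=None is assumed to fall back to ndimage's default connectivity (4-connected in 2-D).

```python
import numpy as np
from scipy import ndimage


def kill_small_vectorized(binary_image, min_size, structure=None):
    """Keep only connected objects with at least min_size pixels."""
    binary_image = np.asarray(binary_image, dtype=bool)
    label_image, num_objects = ndimage.label(binary_image, structure)
    # areas[k] is the number of pixels carrying label k (0 = background).
    areas = np.bincount(label_image.ravel(), minlength=num_objects + 1)
    # Lookup table: keep[k] is True when object k is big enough.
    keep = areas >= min_size
    keep[0] = False  # never keep the background
    # Indexing the table with the label image maps every pixel at once.
    return keep[label_image]
```

The lookup replaces one full-image comparison per object with a single indexing pass, so the cost no longer grows with the number of objects.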