[SciPy-User] ndimage masked array handling

David Shean dshean@gmail....
Mon Nov 19 02:28:06 CST 2012

Hi all,
I'm running into issues similar to those described in this ticket (opened 3 years ago):

My workflows typically involve loading a geotif as a masked array:

#Given input dataset, return a masked array for the input band
def fn_getma(src_fn, bnum=1, ndv=0):
    src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
    b = src_ds.GetRasterBand(bnum)
    b_ndv = b.GetNoDataValue()
    if (b_ndv is not None):
        ndv = b_ndv
    bma = numpy.ma.masked_equal(b.ReadAsArray(), ndv)
    return bma

The scipy.ndimage package is great, and I use it often.  Unfortunately, the filters and interpolation routines can't handle masked arrays.  

I've had some success using the following hack, passing a masked array filled with numpy.nan to the desired scipy.ndimage function f_a:

def nanfill(a, f_a, *args, **kwargs):
    ndv = a.fill_value  
    b = f_a(a.filled(numpy.nan), *args, **kwargs)
    out = numpy.ma.fix_invalid(b, copy=False)
    return out

For example:
b = fn_getma('img.tif')
nanfill(b, scipy.ndimage.gaussian_filter, 2)

This works, but the nan regions are improperly expanded.  The desired behavior would be to ignore any nans within the filter window and return an output value using the remaining valid input values.  Instead, nan is returned whenever a single nan is encountered within the filter window.

My current issue is with scipy.ndimage.interpolation.map_coordinates.  I need to interpolate values for non-integer indices over a masked array containing "holes":

#Arrays containing pixel coordinates
pX = array([ 521.61974178,  520.18007917,  531.9847472 , ...,  382.70468842,  382.04165718,  381.38110078])
pY = array([ 1551.08334089,  1540.31571092,  1626.53056875, ...,  2556.5575993 ,  2562.73110509,  2568.90483958])

#Rounding to nearest int
b[numpy.around(pY).astype(int), numpy.around(pX).astype(int)]
masked_array(data = [194.732543945 187.755401611 192.118453979 ..., 308.895629883 311.699554443 306.658691406], mask = [False False False ..., False False False], fill_value = 0.0)

#Interpolating with map_coordinates, uses default b.fill_value, which will be considered valid during interpolation
scipy.ndimage.interpolation.map_coordinates(b, [pY, pX])
array([ 194.61734009,  187.88977051,  192.112854  , ...,  309.16894531,  312.19412231,  306.87319946], dtype=float32)

#Fill holes with nan.  Returns all nans, even for regions with sufficient valid values
scipy.ndimage.interpolation.map_coordinates(b.filled(numpy.nan), [pY, pX])
array([ nan,  nan,  nan, ...,  nan,  nan,  nan], dtype=float32)

An alternative approach would be to fill all holes up front with inpainting, but I would prefer to limit my analysis to valid values.

Hopefully this is all clear.  Any other ideas for workarounds?  Perhaps the ticket should be updated, or priority bumped?  Thanks.

More information about the SciPy-User mailing list