[Scipy-tickets] [SciPy] #1155: ndimage.filters and ndimage.interpolation is incompatible with masked_array

SciPy Trac scipy-tickets@scipy....
Tue Nov 20 00:17:23 CST 2012

#1155: ndimage.filters and ndimage.interpolation is incompatible with masked_array
 Reporter:  sam                    |       Owner:  somebody   
     Type:  defect                 |      Status:  new        
 Priority:  high                   |   Milestone:  Unscheduled
Component:  scipy.ndimage          |     Version:  0.11.0     
 Keywords:  ndimage, masked_array  |  
Changes (by dshean):

 * cc: dshean (added)
  * priority:  normal => high
  * version:  0.7.0 => 0.11.0


 This issue has become a real problem.  I work with 2D geospatial data that
 contains missing values, which means that I can't use many of the core
 scipy.ndimage utilities.

 My workflows typically involve loading a geotiff as a masked array:

 #Given input filename, return a masked array for specified band
 def fn_getma(src_fn, bnum=1, ndv=0):
    src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
    b = src_ds.GetRasterBand(bnum)
    b_ndv = b.GetNoDataValue()
    if (b_ndv is not None):
        ndv = b_ndv
    bma = numpy.ma.masked_equal(b.ReadAsArray(), ndv)
    return bma

 The scipy.ndimage package is great, and I use it often.  Unfortunately, as
 noted by the original post, the filters and interpolation routines can't
 handle masked arrays.

 I've had some success using the following hack, passing a masked array
 filled with numpy.nan to the desired scipy.ndimage function f_a:

 def nanfill(a, f_a, *args, **kwargs):
    ndv = a.fill_value
    b = f_a(a.filled(numpy.nan), *args, **kwargs)
    out = numpy.ma.fix_invalid(b, copy=False)
    return out

 For example:

 b = fn_getma('img.tif')
 nanfill(b, scipy.ndimage.gaussian_filter, 2)

 This works, but the nan regions are improperly expanded.  The desired
 behavior would be to ignore any nans within the filter window and return
 an output value using the remaining valid input values. Instead, nan is
 returned whenever a single nan is encountered within the filter window.

 My latest issue involves with scipy.ndimage.interpolation.map_coordinates.
 I need to interpolate values for non-integer indices over a masked array
 containing "holes":

 #Arrays containing non-integer pixel coordinates
 pX = array([ 521.61974178,  520.18007917,  531.9847472 , ...,
 382.70468842,  382.04165718,  381.38110078])
 pY = array([ 1551.08334089,  1540.31571092,  1626.53056875, ...,
 2556.5575993 ,  2562.73110509,  2568.90483958])

 #Rounding to nearest int
 b[numpy.around(pY).astype(int), numpy.around(pX).astype(int)]
 masked_array(data = [194.732543945 187.755401611 192.118453979 ...,
 308.895629883 311.699554443 306.658691406], mask = [False False False ...,
 False False False], fill_value = 0.0)

 #Interpolating with map_coordinates, uses default b.fill_value (0), which
 will be included during interpolation, affecting output values near
 scipy.ndimage.interpolation.map_coordinates(b, [pY, pX])
 array([ 194.61734009,  187.88977051,  192.112854  , ...,  309.16894531,
 312.19412231,  306.87319946], dtype=float32)

 #Attempt to fill holes with nan before using map_coordinates.  Returns all
 nans, even for regions that should have sufficient valid data for the
 scipy.ndimage.interpolation.map_coordinates(b.filled(numpy.nan), [pY, pX])
 array([ nan,  nan,  nan, ...,  nan,  nan,  nan], dtype=float32)

 I'm out of ideas for workarounds.  As more and more users recognize the
 power of masked arrays, especially for image processing tasks, I'm certain
 that this issue will continue to come up.  I hope that there is a
 relatively simple solution.

Ticket URL: <http://projects.scipy.org/scipy/ticket/1155#comment:2>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.

More information about the Scipy-tickets mailing list