[Scipy-tickets] [SciPy] #1155: ndimage.filters and ndimage.interpolation is incompatible with masked_array
SciPy Trac
scipy-tickets@scipy....
Tue Nov 20 00:17:23 CST 2012
#1155: ndimage.filters and ndimage.interpolation is incompatible with masked_array
-----------------------------------+----------------------------------------
Reporter: sam | Owner: somebody
Type: defect | Status: new
Priority: high | Milestone: Unscheduled
Component: scipy.ndimage | Version: 0.11.0
Keywords: ndimage, masked_array |
-----------------------------------+----------------------------------------
Changes (by dshean):
* cc: dshean (added)
* priority: normal => high
* version: 0.7.0 => 0.11.0
Comment:
This issue has become a real problem. I work with 2D geospatial data that
contains missing values, which means that I can't use many of the core
scipy.ndimage utilities.
My workflows typically involve loading a geotiff as a masked array:
{{{
#Given input filename, return a masked array for specified band
def fn_getma(src_fn, bnum=1, ndv=0):
src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
b = src_ds.GetRasterBand(bnum)
b_ndv = b.GetNoDataValue()
if (b_ndv is not None):
ndv = b_ndv
bma = numpy.ma.masked_equal(b.ReadAsArray(), ndv)
return bma
}}}
The scipy.ndimage package is great, and I use it often. Unfortunately, as
noted by the original post, the filters and interpolation routines can't
handle masked arrays.
I've had some success using the following hack, passing a masked array
filled with numpy.nan to the desired scipy.ndimage function f_a:
{{{
def nanfill(a, f_a, *args, **kwargs):
ndv = a.fill_value
b = f_a(a.filled(numpy.nan), *args, **kwargs)
out = numpy.ma.fix_invalid(b, copy=False)
out.set_fill_value(ndv)
return out
}}}
For example:
{{{
b = fn_getma('img.tif')
nanfill(b, scipy.ndimage.gaussian_filter, 2)
}}}
This works, but the nan regions are improperly expanded. The desired
behavior would be to ignore any nans within the filter window and return
an output value using the remaining valid input values. Instead, nan is
returned whenever a single nan is encountered within the filter window.
My latest issue involves with scipy.ndimage.interpolation.map_coordinates.
I need to interpolate values for non-integer indices over a masked array
containing "holes":
{{{
#Arrays containing non-integer pixel coordinates
pX = array([ 521.61974178, 520.18007917, 531.9847472 , ...,
382.70468842, 382.04165718, 381.38110078])
pY = array([ 1551.08334089, 1540.31571092, 1626.53056875, ...,
2556.5575993 , 2562.73110509, 2568.90483958])
#Rounding to nearest int
b[numpy.around(pY).astype(int), numpy.around(pX).astype(int)]
masked_array(data = [194.732543945 187.755401611 192.118453979 ...,
308.895629883 311.699554443 306.658691406], mask = [False False False ...,
False False False], fill_value = 0.0)
#Interpolating with map_coordinates, uses default b.fill_value (0), which
will be included during interpolation, affecting output values near
"holes".
scipy.ndimage.interpolation.map_coordinates(b, [pY, pX])
array([ 194.61734009, 187.88977051, 192.112854 , ..., 309.16894531,
312.19412231, 306.87319946], dtype=float32)
#Attempt to fill holes with nan before using map_coordinates. Returns all
nans, even for regions that should have sufficient valid data for the
interpolation.
scipy.ndimage.interpolation.map_coordinates(b.filled(numpy.nan), [pY, pX])
array([ nan, nan, nan, ..., nan, nan, nan], dtype=float32)
}}}
I'm out of ideas for workarounds. As more and more users recognize the
power of masked arrays, especially for image processing tasks, I'm certain
that this issue will continue to come up. I hope that there is a
relatively simple solution.
--
Ticket URL: <http://projects.scipy.org/scipy/ticket/1155#comment:2>
SciPy <http://www.scipy.org>
SciPy is open-source software for mathematics, science, and engineering.
More information about the Scipy-tickets
mailing list