[SciPy-User] Help optimizing an algorithm
Wed Jan 30 23:10:50 CST 2013
> Thanks, but considering that in practice we can have values up to 2^16 from this camera, this seems likely to create an approximately 2^16 x 2^9 x 2^9 array, because the camera's outputs are unsigned 16-bit integers. With two bytes per element, that'd mean 32.768 GB of RAM for the lookup table. Of course it'd be much more feasible if the input range can be constrained to not be the full range of data.
> Another thing I should note is that the nonlinearity in the camera response is fairly localized -- there's one region about 20 counts wide, and another about 500 counts wide, and the rest of the response is basically linear. So outside of those two regions, we can just linearly interpolate and be confident we're getting the right value.
Ah, yeah, if you're getting legitimately 16 bits of dynamic range then a per-pixel lookup table scales rather badly!
I think that your best bet would be to look at scipy.ndimage.map_coordinates(). It's like the fancy-indexing trick I proposed in that you provide an array of indices that map into an look-up table, except now the indices can be float values and map_coordinates() interpolates the value of the look-up table between positions (interpolation is with spline fits of a specified order, including 1 aka linear).
So you'd need a 2^9 x 2^9 x n array for the table, where n is the number of "control points" for the (linear or spline) fit to your transfer functions. Then for each input image, you'd pass in a 3 x 2^9 x 2^9 array of coordinates to map into the input. In the simplest case, the coordinate array would look like this: [x_indices, y_indices, n * input.astype(float) / 2**16)], where (x_indices, y_indices) = numpy.indices((2**9,2**9)).
This would be for n control points evenly spaced across the 2^16 range, obviously. You could of course also set the positions of the control points for each transfer function differently for each pixel, which would necessitate a slightly more complex transformation of the original image's values into interpolation positions in the input array, but that's straightforward enough.
If this doesn't make clear sense, I'll send proper example code.
And if that doesn't run fast enough, you can use OpenGL to do the same thing but with the linear interpolation running on the GPU, using GLSL and a 2D texture sampler (the input image) to build up the coordinates to send to a 3D texture sampler (the lookup table). This is actually a lot less scary than it sounds, and I can give some tips if needed.
PS. What sort of camera has a true 16-bit depth and needs a per-pixel linearity correction? Is it an EMCCD?
More information about the SciPy-User