[Numpy-discussion] sorting -inf, nan, inf
Chris.Barker at noaa.gov
Wed Sep 20 17:18:27 CDT 2006
Charles R Harris wrote:
> Thinking a bit, keeping the values in place isn't easy.
Why the heck would "in place" be desirable for sorted data anyway? I
understand that it means that if there is a NaN in the nth position
before sorting, there will be one in the nth position after sorting.
However, I see absolutely no reason at all why that would be useful (or
any more useful than putting them anywhere else).
A couple years ago, there was a long debate on this list about whether
numpy should pass -inf, NaN, and +inf through all the ufuncs without
error. There were two schools of thought:
1) They indicate a problem, and the programmer should know about that problem
as soon as it occurs, not at the end of the computation, many steps
later, when they might get presented with nothing but NaNs.
2) The whole point of "vector" computation is that you can act on a
whole bunch of numbers at once. If only a subset of those numbers are
invalid, why stop the process? Raising an error when a single number has
a problem defeats the purpose of vector operations.
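To make the two positions concrete, here is a rough sketch in plain Python.
It is not how numpy's ufuncs are implemented; `apply_sqrt` and its
`on_invalid` argument are names made up for this illustration only.

```python
import math

def apply_sqrt(values, on_invalid="nan"):
    """Square root of each value (hypothetical helper, not a numpy API).

    on_invalid="raise": school (1), stop at the first invalid input.
    on_invalid="nan":   school (2), record NaN and keep processing.
    """
    out = []
    for v in values:
        if v < 0:
            if on_invalid == "raise":
                raise ValueError(f"invalid input: {v}")
            out.append(math.nan)  # flag the bad element, continue with the rest
        else:
            out.append(math.sqrt(v))
    return out
```

With `on_invalid="nan"` one bad element costs you one NaN in the output;
with `on_invalid="raise"` it costs you the whole computation.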
It seems that numpy has settled on school of thought (2), at least by
default. That being the case, it should apply to sorting also. If it
does, then that means no exception will be raised, but it makes no
difference where the heck the NaNs end up in the sorted array, as long
as everything else is in order. NaN means exactly what it's called: it's
not a number, so it doesn't matter what you do with them, as long as
they are preserved and don't mess up other things. Let the coder decide
what they want to do with them, and when they want to do it. Personally,
I'd prefer that they all ended up at the beginning or end after sorting,
but it really doesn't much matter.
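As a sketch of the "all at the end" behaviour I have in mind, here is one
way to get it in plain Python: pull the NaNs out, sort what is left, and
put the NaNs back at the end. This is only an illustration on Python lists,
not a proposal for how numpy's sort should be written; `sort_nans_last` is
a made-up name.

```python
import math

def sort_nans_last(values):
    """Return a sorted copy with every NaN moved to the end (hypothetical helper)."""
    # Every ordering comparison involving NaN is False, so NaNs are kept
    # out of the comparison-based sort entirely.
    ordered = sorted(v for v in values if not math.isnan(v))
    nans = [v for v in values if math.isnan(v)]
    return ordered + nans
```

-inf and +inf need no special handling here, since they compare normally
against every other non-NaN value.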
That being said, if it's impossible to do an efficient sort with NaNs
mixed in, then we'll just have to live with it. It really would be best
if an exception was raised if the non-NaN values are not going to be
sorted correctly -- that really would surprise people!
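The fallback could look something like the following, again as a
plain-Python sketch with a made-up name (`strict_sort`), assuming a linear
scan for NaNs before sorting is an acceptable cost:

```python
import math

def strict_sort(values):
    """Sort, but refuse input containing NaN (hypothetical helper)."""
    # One pass over the data before sorting; fail loudly rather than
    # return an array whose non-NaN values may be out of order.
    if any(math.isnan(v) for v in values):
        raise ValueError("cannot sort: input contains NaN")
    return sorted(values)
```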
> It would probably also not be unreasonable to punt and document sort
> as failing in the presence of nans.
That would be one of the worst options, but may be the only one available.
Christopher Barker, Ph.D.
NOAA/OR&R/HAZMAT (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker at noaa.gov