[Numpy-discussion] Apropos ticket #913
Charles R Harris
charlesr.harris@gmail....
Wed Mar 4 23:32:20 CST 2009
On Wed, Mar 4, 2009 at 9:09 PM, David Cournapeau <
david@ar.media.kyoto-u.ac.jp> wrote:
> Charles R Harris wrote:
> >
> >
> > On Wed, Mar 4, 2009 at 1:57 PM, Pauli Virtanen <pav@iki.fi
> > <mailto:pav@iki.fi>> wrote:
> >
> > Wed, 04 Mar 2009 13:18:55 -0700, Charles R Harris wrote:
> > [clip]
> > > There are python max/min and their behaviour depends on the scalar
> > > type. I haven't looked at the numpy scalars to see precisely what
> > > they do.
> > >
> > > Numpy max/min are aliases for amax/amin defined when the core is
> > > imported. The functions amax/amin in turn map to the array methods
> > > max/min which call the maximum.reduce/minimum.reduce ufuncs, so they
> > > all propagate nans, i.e., if the array contains a nan, nan will be
> > > the return value.
> > >
> > > The nonpropagating comparisons are the ufuncs fmax/fmin and there are
> > > no corresponding array methods. I think fmax/fmin should be renamed
> > > fmaximum/fminimum before the release of 1.3 and the names fmax/fmin
> > > reserved for the reduced versions to match the names amax/amin. I'll
> > > do that if there are no objections.
> >
> > Aren't the nonpropagating versions of `amax` and `amin` called `nanmax`
> > and `nanmin`? But these are functions, not array methods.
> >
> > What does the `f` in the beginning of `fmax` and `fmin` stand for?
> >
> >
> > The functions fmax/fmin are C standard library names; I assume the f
> > stands for floating like the f in fabs. Nanmax and nanmin work by
> > replacing nans with a fill value and then performing the specified
> > operation. For instance, nanmin replaces nans with inf. In contrast,
> > the functions fmax and fmin are real ufuncs and return nan when *both*
> > the inputs are nans, return the non-nan value when only one of the
> > inputs is a nan, and do the normal comparisons when both inputs are
> > valid.
>
> Thanks for the clarification. I agree fmax/fmin is better because of the
> C convention.
Better in what way? I was suggesting renaming them to fmaximum/fminimum, but
I am perfectly happy with the current names if you feel fmax/fmin are better
because of the C connection. I was just looking for a reasonably short name
for fmax.reduce/fmin.reduce and thought fmax/fmin would be naturals.
Unfortunately, they were already taken ;)
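
To make the distinction concrete, here is a minimal pure-Python sketch of the
two comparison rules discussed in this thread and of what their reduced
versions compute. The helper names propagating_max and nonpropagating_max are
made up for illustration, and this is not how the ufuncs are implemented; the
functions only mimic the behaviour described for maximum and fmax on plain
Python floats.

```python
import math
from functools import reduce


def propagating_max(a, b):
    """Mimics the rule described for maximum: any nan input gives nan."""
    if math.isnan(a):
        return a
    if math.isnan(b):
        return b
    return a if a >= b else b


def nonpropagating_max(a, b):
    """Mimics the rule described for fmax: nan only if both inputs are nan."""
    if math.isnan(a):
        return b  # b is either the valid value or also nan
    if math.isnan(b):
        return a
    return a if a >= b else b


data = [1.0, float("nan"), 3.0, 2.0]

# Reducing with the propagating rule: a single nan poisons the result.
propagated = reduce(propagating_max, data)

# Reducing with the non-propagating rule: nans are skipped unless
# every entry is nan.
skipped = reduce(nonpropagating_max, data)
all_nan = reduce(nonpropagating_max, [float("nan"), float("nan")])

print(propagated, skipped, all_nan)
```

In numpy terms the two reductions play the role of maximum.reduce and
fmax.reduce respectively.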
One thing that still bothers me a bit is the return value of fmax/fmin when
comparing two complex nan values. A complex number is a nan whenever the
real or imaginary part is nan. Currently the functions return such a
number, but originally they returned a complex number with both parts set to
nan. The current implementation was a compromise that kept the code simple
while never explicitly using a nan value, i.e., the nan came from one of the
inputs. I avoided the explicit use of a nan value because the NAN macro was
possibly unreliable at the time. I'm open to thoughts on what the behavior
should be.
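
The two candidate behaviours for complex inputs can be put side by side in a
small pure-Python sketch. Everything here is illustrative: the function names
are hypothetical, the ordering of complex values by real part and then
imaginary part is an assumption, and so is the choice of returning the first
input when both are nan.

```python
import cmath

NAN = float("nan")


def _key(z):
    # Assumption: order complex values by real part, then imaginary part.
    return (z.real, z.imag)


def cfmax_from_input(a, b):
    """Both nan: hand back one of the inputs unchanged (here, the first)."""
    if cmath.isnan(a):
        return a if cmath.isnan(b) else b
    if cmath.isnan(b):
        return a
    return a if _key(a) >= _key(b) else b


def cfmax_all_nan(a, b):
    """Both nan: return a value with both parts explicitly set to nan."""
    if cmath.isnan(a) and cmath.isnan(b):
        return complex(NAN, NAN)
    return cfmax_from_input(a, b)


x = complex(NAN, 1.0)  # nan because the real part is nan
y = complex(2.0, NAN)  # nan because the imaginary part is nan

kept = cfmax_from_input(x, y)  # still carries the valid imaginary part of x
filled = cfmax_all_nan(x, y)   # both parts are nan
print(kept, filled)
```

The first variant never manufactures a nan of its own, which is the property
the compromise was after. The second gives a result that does not depend on
which input happened to be chosen.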
> We should clearly document the difference between those
> functions, though.
You mean the differences with nanmax/nanmin?
> Would you have time to implement something similar for
> sort (sort is important for correct and relatively efficient support of
> nanmedian I think) ? If not, that's ok, we'll do without for 1.3 series,
I would rather take more time for the sort functions. It would be easy to
make the nans sort to one end or the other in merge sort, but I would want
to make sure that quicksort was still efficient. I'm also not convinced that
would solve the median problem. If 60% of the entries were nans, would nan be
the median? If not, we would have to find where the nans began or ended, and
that would most likely need searchsorted to be fixed also.
So in the case of sort and median I think we should first settle what the
behavior should be, then do benchmarks and testing to see if we are happy
with the result.
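
To illustrate why the behaviour needs settling first, here is a rough
pure-Python sketch of the two median policies that follow from sorting nans
to the end. All function names are hypothetical and nothing here reflects an
existing numpy implementation; the binary search stands in for the
searchsorted step mentioned above.

```python
import math


def sort_nans_last(values):
    """Return a sorted copy with every nan moved to the end."""
    return sorted(
        values,
        key=lambda v: (math.isnan(v), 0.0 if math.isnan(v) else v),
    )


def first_nan_index(sorted_values):
    """Binary search for where the nans begin in a nans-last sorted list."""
    lo, hi = 0, len(sorted_values)
    while lo < hi:
        mid = (lo + hi) // 2
        if math.isnan(sorted_values[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo


def median_of_sorted(sorted_values, n):
    """Median of the first n entries of an already sorted list."""
    if n == 0:
        return float("nan")
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return 0.5 * (sorted_values[mid - 1] + sorted_values[mid])


def median_ignoring_nans(values):
    # Policy A: drop the nans, take the median of what is left.
    ordered = sort_nans_last(values)
    return median_of_sorted(ordered, first_nan_index(ordered))


def median_counting_nans(values):
    # Policy B: nans count as the largest entries, so they can be the median.
    ordered = sort_nans_last(values)
    return median_of_sorted(ordered, len(ordered))


nan = float("nan")
sample = [3.0, nan, 1.0, nan, nan]  # 60% nans

print(median_ignoring_nans(sample), median_counting_nans(sample))
```

With 60% nans the two policies disagree: one gives the median of the valid
entries and the other gives nan, which is exactly the question raised above.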
Chuck