[Numpy-discussion] A better median function?
Fri Aug 21 14:08:50 CDT 2009
On Fri, Aug 21, 2009 at 11:33 AM, Matthew Brett<email@example.com> wrote:
> Nicolas investigated algorithms that find the lower (or upper) median
> value. The lower median is the median iff there are an odd number of
> entries in our list, or the lower of the central values in the sort,
> when there are an even number of values in the list. So, we need the
> upper _and_ lower median when there are an even number of entries. I
> guess the necessity of doing those two related searches may change the
> relative strengths of the algorithms, but I'm sure this is a
> well-known problem.
My trivial solution to this was that since the data is now in two
"partitions", all one needs to do is quickly scan through the upper
partition to find the minimum value. Since you already have the lower
median from the select run, this minimum is by definition the upper
median. This can be averaged with the lower median and you are done.
Brain dead, perhaps, but it worked for my test. I did not (yet)
investigate whether this minimum can be located in some more expedient
fashion (i.e. did the select put the minimum in some consistent place
where we don't have to scan through half the input array?).
More information about the NumPy-Discussion