[SciPy-User] scipy.stats.nanmedian

josef.pktd@gmai... josef.pktd@gmai...
Fri Jan 22 10:46:10 CST 2010


On Fri, Jan 22, 2010 at 11:09 AM, Keith Goodman <kwgoodman@gmail.com> wrote:
> On Thu, Jan 21, 2010 at 8:18 PM,  <josef.pktd@gmail.com> wrote:
>> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
>>> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
>>>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote:
>>>>> That's the only way I was able to figure out how to pull 1.0 out of
>>>>> np.array(1.0). Is there a better way?
>>>>
>>>>
>>>> .item()
>>>
>>> Thanks. item() looks better than tolist().
>>>
>>> I simplified the function:
>>>
>>> def nanmedian(x, axis=0):
>>>    x, axis = _chk_asarray(x,axis)
>>>    if x.ndim == 0:
>>>        return float(x.item())
>>>    x = x.copy()
>>>    x = np.apply_along_axis(_nanmedian,axis,x)
>>>    if x.ndim == 0:
>>>        x = float(x.item())
>>>    return x
>>>
>>> and opened a ticket:
>>>
>>> http://projects.scipy.org/scipy/ticket/1098
>>
>>
>> How about getting rid of apply_along_axis?    see attachment
>>
>> I don't know whether or how much faster it is, but there is a ticket
>> that the current version is slow.
>> No hidden bug or corner case guarantee yet.
>
> It is faster. But here is one case it does not handle:
>
> >>> nanmedian([1, 2])
>   array([ 1.5])
> >>> np.median([1, 2])
>   1.5
>
> I'm sure it could be fixed. But having to fix it, and the fact that it
> is a larger change, decreases the likelihood that it will make it into
> the next version of scipy. One option is to make the small bug fix I
> suggested (ticket #1098) and add the corresponding unit tests. Then we
> can take our time to design a better version of nanmedian.

I hadn't noticed the difference from np.median for this case. I think I
was carrying over the answer about return shapes from the other thread on
what splines and interpolation return.

If I change the last 3 lines to

    if nanmed.size == 1:
        return nanmed.item()
    return nanmed

then I get agreement with numpy for the following test cases:

print nanmedian(1), np.median(1)
print nanmedian(np.array(1)), np.median(1)
print nanmedian(np.array([1])), np.median(np.array([1]))
print nanmedian(np.array([[1]])), np.median(np.array([[1]]))
print nanmedian(np.array([1,2])), np.median(np.array([1,2]))
print nanmedian(np.array([[1,2]])), np.median(np.array([[1,2]]),axis=0)
print nanmedian([1]), np.median([1])
print nanmedian([[1]]), np.median([[1]])
print nanmedian([1,2]), np.median([1,2])
print nanmedian([[1,2]]), np.median([[1,2]],axis=0)
print nanmedian([1j,2]), np.median([1j,2])

Am I still missing any cases?
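
For readers without the attachment, here is a minimal sketch of how a
vectorized nanmedian with the size-1 return rule above could look. It is
not the attached code: the name nanmedian_sketch and the sort-based
approach are assumptions made for illustration, and it assumes Python 3
and a NumPy recent enough to provide np.moveaxis and np.take_along_axis.
It casts to float, so the complex test case is not covered.

```python
import numpy as np


def nanmedian_sketch(x, axis=0):
    """Hypothetical sort-based nanmedian; not the attachment from this thread."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 0:
        # 0-d input: there is nothing to reduce, return a plain Python float.
        return x.item()
    # Sorting places NaNs at the end of every 1-D slice along `axis`.
    s = np.moveaxis(np.sort(x, axis=axis), axis, 0)
    # Number of non-NaN values per slice, kept as a leading axis of length 1.
    n = np.sum(~np.isnan(s), axis=0, keepdims=True)
    # Indices of the two middle elements among the valid values.
    # For an all-NaN slice both indices are 0, which picks a NaN.
    lo = np.maximum((n - 1) // 2, 0)
    hi = n // 2
    nanmed = 0.5 * (np.take_along_axis(s, lo, 0) + np.take_along_axis(s, hi, 0))[0]
    # Return rule from this message: a single value comes back as a scalar.
    if nanmed.size == 1:
        return nanmed.item()
    return nanmed


if __name__ == "__main__":
    for case in (1, [1], [[1]], [1, 2]):
        print(nanmedian_sketch(case), np.median(case))
    print(nanmedian_sketch([[1, 2]]), np.median([[1, 2]], axis=0))
```

The final size check is what turns the array([ 1.5]) result for [1, 2]
into a plain 1.5, matching np.median.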

The vectorized version should be faster for the case reported in
http://projects.scipy.org/scipy/ticket/740
but maybe not for long and narrow arrays.
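
One way to check this is a small timing harness that runs both
approaches on a wide array and on a long, narrow one. The sketch below is
only an illustration: the function names, array shapes, and NaN fraction
are assumptions, the sort-based function stands in for the attachment
(which is not reproduced here), and the timings depend on the machine and
the NumPy version.

```python
import timeit

import numpy as np


def median_per_slice(x, axis=0):
    # apply_along_axis makes one Python-level call per 1-D slice.
    return np.apply_along_axis(lambda v: np.median(v[~np.isnan(v)]), axis, x)


def median_sorted(x, axis=0):
    # One sort over the whole array; NaNs end up last in each slice.
    s = np.moveaxis(np.sort(x, axis=axis), axis, 0)
    n = np.sum(~np.isnan(s), axis=0, keepdims=True)
    lo = np.maximum((n - 1) // 2, 0)
    hi = n // 2
    return 0.5 * (np.take_along_axis(s, lo, 0) + np.take_along_axis(s, hi, 0))[0]


def make_data(shape, nan_fraction=0.1, seed=0):
    # Random normal data with roughly `nan_fraction` of the entries set to NaN.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    x[rng.random(shape) < nan_fraction] = np.nan
    return x


def compare(shape, repeat=3):
    # Best-of-`repeat` wall time in seconds for each approach along axis 0.
    x = make_data(shape)
    t_slice = min(timeit.repeat(lambda: median_per_slice(x), number=1, repeat=repeat))
    t_sort = min(timeit.repeat(lambda: median_sorted(x), number=1, repeat=repeat))
    return t_slice, t_sort


if __name__ == "__main__":
    # (50, 2000): many short slices; (20000, 3): few long slices.
    for shape in [(50, 2000), (20000, 3)]:
        print(shape, compare(shape))
```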

Josef