[Numpy-discussion] algorithm for faster median calculation ?

Christoph Deil deil.christoph@googlemail....
Tue Jan 15 13:49:47 CST 2013


On Jan 15, 2013, at 8:31 PM, Jerome Caron <jerome_caron_astro@ymail.com> wrote:

> Dear all,
> I am new to the Numpy-discussion list.
> I would like to pass along some possibly useful information about calculating the median.
> The message below was posted today on the AstroPy mailing list.
> Kind regards
> Jerome Caron
>  
> #----------------------------------------
> I think the calculation of median values in Numpy is not optimal. I don't know if there are other libraries that do better?
> On my machine I get these results:
> >>> data = numpy.random.rand(5000,5000)
> >>> t0=time.time();print numpy.ma.median(data);print time.time()-t0
> 0.499845739822
> 15.1949999332
> >>> t0=time.time();print numpy.median(data);print time.time()-t0
> 0.499845739822
> 4.32100009918
> >>> t0=time.time();print aspylib.astro.get_median(data);print time.time()-t0
> [ 0.49984574]
> 0.90499997139
> >>>
> The median calculation in Aspylib is using C code from Nicolas Devillard (can be found here: http://ndevilla.free.fr/median/index.html) interfaced with ctypes.
> It could be easily re-used for other, more official packages. I think the code also finds quantiles efficiently.
> See: http://www.aspylib.com/
>  
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

Hi Jerome,

Some of the NumPy developers are already discussing how best to implement a fast median for NumPy here:
https://github.com/numpy/numpy/issues/1811 "median in average O(n) time"
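
To illustrate the idea behind that ticket: a median only needs the middle element(s) in their sorted position, not a fully sorted array, so a selection algorithm is enough. Below is a minimal sketch, not the implementation being discussed there. It assumes a NumPy version that provides numpy.partition, and the helper name median_by_selection is made up for this example.

```python
import numpy as np


def median_by_selection(values):
    """Median of the flattened input, using selection instead of a full sort.

    Hypothetical helper for illustration; assumes numpy.partition is available.
    """
    a = np.asarray(values, dtype=float).ravel()
    n = a.size
    if n == 0:
        raise ValueError("median of an empty array is undefined")
    mid = n // 2
    if n % 2:
        # Odd length: only the middle element has to be in its sorted position.
        return float(np.partition(a, mid)[mid])
    # Even length: place both middle elements, then average them.
    part = np.partition(a, [mid - 1, mid])
    return float(0.5 * (part[mid - 1] + part[mid]))
```

On random data this should agree with numpy.median(data) up to floating-point rounding. How much faster it is will depend on the NumPy version and the machine.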

If you want to get an email when someone posts a comment on that GitHub ticket, sign up for a free GitHub account, then click on "watch thread" at the bottom of that issue.

Note that NumPy is BSD-licensed, so the developers can't incorporate GPL-licensed code.
But I think looking at the method you have in aspylib is OK, so thanks for sharing!

Christoph