[Numpy-discussion] New functions.
Wed Jun 1 10:31:55 CDT 2011
I'd love to see something like a "count_unique" function included. The
numpy.unique function is handy, but it can be a little awkward to
efficiently go back and get counts of each unique value after the
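
For illustration, here is a minimal sketch of what such a helper might look like. The name count_unique and its return convention are only a suggestion, not an existing NumPy function; it combines numpy.unique(return_inverse=True) with numpy.bincount, and it flattens its input, which is an assumption about the desired behaviour.

```python
import numpy as np

def count_unique(a):
    """Return (sorted unique values, count of each value) for a flattened array.

    Hypothetical helper; not part of NumPy.
    """
    a = np.asarray(a).ravel()
    # inverse[i] is the index into `values` of the original element a[i]
    values, inverse = np.unique(a, return_inverse=True)
    # Counting how often each index occurs gives the count of each unique value
    counts = np.bincount(inverse.ravel())
    return values, counts

values, counts = count_unique([3, 1, 3, 2, 1, 3])
```

Here values and counts line up element by element, so zip(values, counts) pairs each unique value with its number of occurrences.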
On Wed, Jun 1, 2011 at 8:17 AM, Keith Goodman <email@example.com> wrote:
> On Tue, May 31, 2011 at 8:41 PM, Charles R Harris
> <firstname.lastname@example.org> wrote:
>> On Tue, May 31, 2011 at 8:50 PM, Bruce Southey <email@example.com> wrote:
>>> How about including all or some of Keith's Bottleneck package?
>>> He has tried to include some of the discussed functions and tried to
>>> make them very fast.
>> I don't think they are sufficiently general as they are limited to 2
>> dimensions. However, I think the moving filters should go into scipy, either
>> in ndimage or maybe signals. Some of the others we can still speed up
>> significantly, for instance nanmedian, by using the new functionality in
>> numpy, i.e., numpy sort has worked with nans for a while now. It looks like
>> call overhead dominates the nanmax times for small arrays and this might
>> improve if the ufunc machinery is cleaned up a bit more; I don't know how
>> far Mark got with that.
> Currently Bottleneck accelerates 1d, 2d, and 3d input. Anything else
> falls back to a slower, non-cython version of the function. The same
> goes for int32, int64, float32, float64.
> It should not be difficult to extend to higher nd and more dtypes
> since everything is generated from a template. The problem is that there
> would be a LOT of cython auto-generated C code since there is a
> separate function for each ndim, dtype, axis combination.
> Each of the ndim, dtype, axis functions currently has its own copy of
> the algorithm (such as median). Pulling that out and reusing it should
> save a lot of trees by reducing the auto-generated C code size.
> I recently added a partsort and argpartsort.
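
On the nanmedian remark quoted above: since numpy sort places NaNs at the end of the sorted array, a 1d nanmedian can be sketched roughly as follows. This is only an illustration of the idea, not Bottleneck's or NumPy's implementation; the name nanmedian_1d is made up here, and returning NaN for all-NaN input is an assumption.

```python
import numpy as np

def nanmedian_1d(a):
    """Median of the non-NaN entries of a, treated as a flat float array.

    Hypothetical helper for illustration only.
    """
    # np.sort pushes NaNs to the end, so the valid data is a sorted prefix
    a = np.sort(np.asarray(a, dtype=float).ravel())
    n = a.size - np.count_nonzero(np.isnan(a))  # number of non-NaN entries
    if n == 0:
        return np.nan  # assumed convention for empty or all-NaN input
    mid = n // 2
    if n % 2:
        return a[mid]  # odd count: the single middle element
    return 0.5 * (a[mid - 1] + a[mid])  # even count: mean of the two middle ones
```

A full sort does more work than a median needs; a partial sort such as the partsort mentioned above would only have to place the middle elements.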