[Numpy-discussion] Adding an axis argument to numpy.unique

josef.pktd@gmai... josef.pktd@gmai...
Mon Aug 19 07:39:09 CDT 2013


On Sun, Aug 18, 2013 at 7:14 PM, Joe Kington <joferkington@gmail.com> wrote:
> Hi everyone,
>
> I've recently put together a pull request that adds an `axis` kwarg to
> `numpy.unique` so that `unique` can easily be used to find unique
> rows/columns/sub-arrays/etc of a larger array.
>
> https://github.com/numpy/numpy/pull/3584
>
> Currently, this works as a wrapper around `unique`. If `axis` is specified,
> it reshapes the input to a 2D contiguous array, views each row as a single
> item, then passes it on to `unique`.  For int and string dtypes, each row is
> viewed as a void dtype and therefore bitwise-equality is used for
> comparisons.  For all other dtypes, each row is viewed as a structured
> array.
>
> The current implementation has two main drawbacks:
>
>   - For anything other than ints and strings, it's relatively slow.
>   - It doesn't work with object arrays of any sort.
>
> I'd appreciate any thoughts/feedback folks might have on both the general
> idea and this specific implementation.  I think it's a worthwhile addition,
> but I'm biased.
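
For readers following along, the wrapper approach described in the quoted
message can be sketched roughly as below. This is only an illustration of the
void-view idea for a 2D integer array, not the code from the pull request;
the helper name `unique_rows` is hypothetical, and keeping first-occurrence
order is my own choice here.

```python
import numpy as np

def unique_rows(a):
    """Hypothetical helper: unique rows of a 2D integer array via a void view."""
    # The view trick needs each row to occupy one contiguous run of bytes.
    a = np.ascontiguousarray(a)
    # Treat each row as a single opaque item of (itemsize * ncols) bytes,
    # so rows are compared bitwise.
    row_dtype = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    rows = a.view(row_dtype).ravel()
    # return_index gives the position of the first occurrence of each row.
    _, idx = np.unique(rows, return_index=True)
    # Sort the indices so rows come back in order of first appearance
    # (an assumption for this sketch, independent of byte ordering).
    return a[np.sort(idx)]

a = np.array([[1, 2],
              [3, 4],
              [1, 2]])
result = unique_rows(a)
```

Because the comparison is bitwise, this sketch is only suitable for dtypes
where equal values always have equal bytes, which is why the quoted message
restricts it to ints and strings.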


just a general comment

I have been missing a `unique_rows` or something like that, which
seems to be the target of this change.

However, my first interpretation of an axis argument in unique would
be that it treats each column (or whatever along axis) separately.
Analogously to max, argmax and similar.

On second thought:
unique with axis working on each column separately wouldn't produce a
nice return array, because the result won't be rectangular (in general).
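
A small illustration of the non-rectangular problem, assuming the
per-column interpretation is emulated with a plain loop over columns (the
array and variable names are made up for this example):

```python
import numpy as np

a = np.array([[1, 2],
              [1, 3],
              [1, 4]])

# Apply unique to each column separately, the way max or argmax
# would reduce along an axis.
per_column = [np.unique(a[:, j]) for j in range(a.shape[1])]

# The columns yield different numbers of unique values, so the results
# cannot be stacked into one rectangular array.
lengths = [len(u) for u in per_column]
```

Here the first column has one unique value and the second has three, so the
only natural return type would be a list of arrays.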

Josef

>
> Thanks!
> -Joe