[Numpy-discussion] extract elements of an array that are contained in another array?
josef.pktd@gmai...
Thu Jun 4 12:27:25 CDT 2009
On Thu, Jun 4, 2009 at 12:32 PM, Alan G Isaac <aisaac@american.edu> wrote:
> On 6/4/2009 11:29 AM josef.pktd@gmail.com apparently wrote:
>> intersect1d is the intersection between sets (which are stored as
>> arrays), just like in the mathematical definition the two sets only
>> have unique elements
>
> Hmmm. OK, I see you and Robert believe this.
> But it does not match the documentation.
> But indeed, I see that the documentation is incorrect.
> E.g.,
>
>>>> np.intersect1d([1,1,2,3,3,4],[1,4])
> array([1, 1, 3, 4])
>
> Is this a bug or a documentation bug?
>
>
>
>> intersect1d_nu is the intersection between two arrays which can have
>> repeated elements. The result is a set, i.e. unique elements, stored
>> as an array
>
>> same for setmember1d, setmember1d_nu
>
> I cannot understand this.
> Following your proposed reasoning,
> I expect a[setmember1d_nu(a,b)]
> to return the same as
> intersect1d_nu(a, b).
> It does not.
I don't have setmember1d_nu available right now, but from my reading
we should have
intersect1d_nu(a, b) == np.unique(a[setmember1d_nu(a,b)])
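A minimal sketch of how this relation could be checked. It assumes a current NumPy, where np.isin stands in for setmember1d_nu and np.intersect1d (which takes unique values of both inputs by default) stands in for intersect1d_nu; the arrays are made up for illustration.

```python
import numpy as np

# Illustrative inputs with repeated elements (not from the thread).
a = np.array([1, 1, 2, 3, 3, 4])
b = np.array([1, 4, 4, 5])

# Assumption: np.isin plays the role of setmember1d_nu, i.e. a boolean
# mask over a that is True where the element also occurs in b.
mask = np.isin(a, b)

# Assumption: current np.intersect1d plays the role of intersect1d_nu,
# returning the sorted unique values common to both arrays.
lhs = np.intersect1d(a, b)
rhs = np.unique(a[mask])

print(lhs, rhs, np.array_equal(lhs, rhs))
```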
>
>
>
>> so postfix `_nu` only means that this function also works
>> if the two arrays are not really sets
>
> But that just begs the question: what does 'works' mean?
> See my previous comment (above).
>
>
>
>> intersect1d should throw a domain error if you give it arrays with
>> non-unique elements, which is not done for speed reasons
>
> *If* intersect1d behaved *exactly* as documented,
> the example
> intersect1d(a, np.unique(b))
> shows that the documented behavior can be useful.
> And indeed, this would be the match to
> a[setmember1d_nu(a,b)]
I don't know if anyone has looked at the behavior for "unintended" usage.
intersect1d rearranges, sorts
>>> np.intersect1d([4,1,3,3],[3,4])
array([3, 3, 4])
but it gives you the correct multiplicity
>>> np.intersect1d([4,4,4,1,3,3],np.unique([3,4,3,0]))
array([3, 3, 4, 4, 4])
so I guess we have (with a = [4,4,4,1,3,3] and b = [3,4,3,0])
np.intersect1d([4,4,4,1,3,3], np.unique([3,4,3,0])) ==
np.sort(a[setmember1d_nu(a,b)])
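The right-hand side of this relation can be reproduced without relying on how intersect1d treats non-unique input. This is only a sketch, assuming a current NumPy in which np.isin takes the place of setmember1d_nu.

```python
import numpy as np

a = np.array([4, 4, 4, 1, 3, 3])
b = np.array([3, 4, 3, 0])

# Keep every element of a that occurs in b, repeats included, then sort.
# np.isin is used here as a stand-in for setmember1d_nu (an assumption).
kept = np.sort(a[np.isin(a, b)])
print(kept)  # [3 3 4 4 4]
```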
for the example from the help file I don't find any meaningful interpretation
>>> np.intersect1d([1,3,3],[3,1,1])
array([1, 1, 3, 3])
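setmember1d gives a wrong answer when the first array has repeated
elements (the first 1 is reported as a member of [3,4]):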
>>> np.setmember1d([4,1,1,3,3],[3,4])
array([ True, True, False, True, True], dtype=bool)
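For comparison, here is the mask one would expect for the same inputs from a membership test that tolerates repeated elements. The sketch assumes a current NumPy and uses np.isin, which is not the function discussed in this thread.

```python
import numpy as np

a = np.array([4, 1, 1, 3, 3])
b = np.array([3, 4])

# True exactly where an element of a occurs in b; both 1s map to False.
expected_mask = np.isin(a, b)
print(expected_mask)  # [ True False False  True  True]
```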
Note: there are two versions of the docs for np.intersect1d: the
currently published docs, which describe the actual behavior (for the
non-unique case), and the new docs in the doc editor
http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/
which describe the "intended" usage of the functions and also
correspond more closely to the original source docstring
(http://docs.scipy.org/numpy/docs/numpy.lib.arraysetops.intersect1d/?revision=-227).
That's my interpretation.
If you think that the functions also make sense for the "unintended"
usage, then you could add an example to the new docs.
Josef