[SciPy-user] using numpy arrays in sets

Gael Varoquaux gael.varoquaux@normalesup....
Thu Jun 25 13:17:57 CDT 2009


On Thu, Jun 25, 2009 at 01:04:11PM -0500, Robert Kern wrote:
> I agree that it would be handy, but hashability is not the only
> problem. When hashes collide, the objects are then compared by
> equality. This is a problem for numpy arrays because we do not return
> bools.

> The proper fix is to make a set() implementation that allows you to
> provide your own hash and equality functions. This is a general
> solution to a problem that affects more than just numpy arrays.
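
To make the quoted point concrete, here is a minimal sketch of both
problems and of one possible workaround. The ArrayKey wrapper is a
hypothetical helper of mine, not something numpy provides, and it assumes
the array is not modified after being wrapped:

```python
import numpy as np

class ArrayKey:
    """Hypothetical wrapper giving an array a value-based hash and equality."""

    def __init__(self, arr):
        self.arr = np.asarray(arr)
        # Snapshot of shape, dtype and raw bytes, taken at construction time.
        self._key = (self.arr.shape, self.arr.dtype.str, self.arr.tobytes())

    def __hash__(self):
        return hash(self._key)

    def __eq__(self, other):
        if not isinstance(other, ArrayKey):
            return NotImplemented
        # Compares the snapshots, so the result is a plain bool.
        return self._key == other._key

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

# Problem 1: arrays are not hashable, so they cannot go into a set.
try:
    {a}
    raw_array_hashable = True
except TypeError:
    raw_array_hashable = False

# Problem 2: == returns an array of bools, not a single bool.
elementwise = (a == b)

# With the wrapper, equal-valued arrays collapse to a single set entry.
unique = {ArrayKey(a), ArrayKey(b)}
```

The wrapper copies the data with tobytes(), so it is only reasonable for
small arrays, and it goes stale if the wrapped array is later modified in
place.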

I ran into this problem when I was trying to implement something like
a memoize pattern for functions that were taking in arrays. I came up
with a fairly complex solution that I don't want to describe in detail
here, but it involved using the 'id' of the arrays as a hash, and
actually using this id as a key in the set or dictionary.
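
As a rough illustration of the idea only (this is not my actual
implementation, and the names memoize_by_id and total are made up for the
example), an id-keyed memoizer could look like the sketch below. It
assumes the function takes a single array argument, and it keeps a
reference to each array so that its id cannot be reused by another object
while the entry is cached:

```python
import numpy as np

def memoize_by_id(func):
    """Cache results keyed on the identity of the array argument."""
    # id(arr) -> (arr, result). Holding arr keeps the object alive,
    # so its id stays unique for as long as the entry exists.
    cache = {}

    def wrapper(arr):
        key = id(arr)
        if key not in cache:
            cache[key] = (arr, func(arr))
        return cache[key][1]

    wrapper.cache = cache
    return wrapper

calls = []

@memoize_by_id
def total(arr):
    calls.append(1)  # record each real computation
    return float(arr.sum())

a = np.arange(5)
first = total(a)         # computed
second = total(a)        # same object: served from the cache
third = total(a.copy())  # equal values but a different object: recomputed
```

Note that the cache never notices in-place changes: after a[0] = 100,
total(a) would still return the old value. The cache also grows without
bound, since it holds on to every array it has seen.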

That should probably be considered a band-aid, but my experience is
that you can solve a lot of your hashing-related problems with that
band-aid, if you take it into account when designing your code (i.e. you
keep in mind that you have mutables, and that id(a) != id(b) does not
mean that they do not share data).
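
A small example of that last caveat, using a slice, which gives a view
on the original array:

```python
import numpy as np

a = np.arange(6)
b = a[::2]  # a view on a: a distinct Python object backed by the same buffer

different_ids = id(a) != id(b)
share = np.shares_memory(a, b)

b[0] = 99   # writing through the view also changes a
```

So with id-based keys, a and b are two unrelated entries, even though
modifying one modifies the other.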

My 2 cents,

Gaël

