[Numpy-discussion] memoization with ndarray arguments
Mon Mar 23 03:20:17 CDT 2009
On Saturday 21 March 2009, Paul Northug wrote:
> numpy arrays are not hashable, maybe for a good reason.
NumPy arrays are not hashable because they are mutable.
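A minimal sketch of that point, assuming a recent NumPy and Python 3 (the variable names are only illustrative): hashing the array raises TypeError, while an immutable tuple copy can serve as a dict key and is unaffected by later changes to the array.

```python
import numpy as np

X = np.arange(3.0)

# hash() on an ndarray raises TypeError because the array is mutable.
try:
    hash(X)
    hashable = True
except TypeError:
    hashable = False

# A tuple copy of the values is immutable, so it can be used as a dict key.
key = tuple(X)
cache = {key: "result"}

# Mutating the array afterwards does not change the key already stored.
X[0] = 99.0
```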
> I tried
> anyway by keeping a dict of hash(tuple(X)), but started having
> collisions. So I switched to md5.new(X).digest() as the hash function
> and it seems to work ok. In a quick search, I saw cPickle.dumps and
> repr are also used as key values.
Having collisions is not necessarily very bad, unless you have *a lot*
of them. I wonder what kind of X you are dealing with that can provoke
so many collisions when using hash(tuple(X))? Just curious.
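How much a collision hurts depends on what is stored as the key. The sketch below is an illustration under assumptions, not a description of the original poster's code: if the dict is keyed by the tuple itself, two colliding keys are still told apart by equality, whereas keying by the integer returned by hash() would let one entry overwrite the other. The names by_tuple, a and b are hypothetical.

```python
import numpy as np

a = np.array([-1.0])
b = np.array([-2.0])

# In CPython, hash(-1.0) and hash(-2.0) are both -2, so these two tuples
# are expected to collide there. The dict below behaves correctly whether
# or not they collide on a given interpreter.
by_tuple = {}
by_tuple[tuple(a)] = "result for a"
by_tuple[tuple(b)] = "result for b"

# Keys with equal hashes but unequal values remain separate entries,
# because the dict compares keys for equality after matching hashes.
n_entries = len(by_tuple)
```

Keying by hash(tuple(X)) instead throws away that equality check, which is when a collision silently returns the wrong cached value.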
> I am assuming this is a common problem with functions with numpy
> array arguments and was wondering what the best approach is
> (including not using memoization).
If md5.new(X).digest() works well for you, then go ahead; it seems fast:
In : X = np.arange(1000.)
In : timeit hash(tuple(X))
1000 loops, best of 3: 504 µs per loop
In : timeit md5.new(X).digest()
10000 loops, best of 3: 40.4 µs per loop
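For reference, here is a rough sketch of memoization keyed on an MD5 digest of the array contents. It is written for Python 3, where the old md5 module is gone, so hashlib.md5 stands in for md5.new. The helper names array_key, memoize_array and total are hypothetical, and adding dtype and shape to the key is my own assumption, so that arrays with identical bytes but a different layout do not share a cache entry.

```python
import hashlib
from functools import wraps

import numpy as np


def array_key(x):
    """Build a hashable key from the array's dtype, shape and raw bytes."""
    x = np.ascontiguousarray(x)
    digest = hashlib.md5(x.tobytes()).digest()
    return (x.dtype.str, x.shape, digest)


def memoize_array(func):
    """Cache results of a function that takes a single ndarray argument."""
    cache = {}

    @wraps(func)
    def wrapper(x):
        key = array_key(x)
        if key not in cache:
            cache[key] = func(x)
        return cache[key]

    return wrapper


calls = []  # records each real evaluation, for demonstration only


@memoize_array
def total(x):
    calls.append(1)
    return float(x.sum())
```

Calling total(X) twice with equal contents evaluates the function once; changing any element of X produces a different digest and triggers a fresh evaluation. A digest collision would still return a wrong cached value, though for MD5 on ordinary data that is very unlikely.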