[SciPy-Dev] Enhancements to scipy.spatial.cKDTree
Sturla Molden
sturla@molden...
Mon Jul 16 07:34:34 CDT 2012
Den 16.07.2012 02:00, skrev Patrick Varilly:
> Wow, I must confess I wasn't quite expecting the Spanish Inquisition :-D
Sorry, don't take it personally. It is important that code that goes
into SciPy can be maintained for several years ahead, and preferably not
just by you.
>
> The original reason I got into this project was to bring a fast
> query_ball_point to cKDTree (which I really needed, the KDTree
> implementation is so slow as to be unusable for analysing a long
> molecular dynamics trajectory). When I started, I thought I was only
> porting over one method of KDTree, so I strove to keep the code
> changes to a minimum (hence not abstracting out the incremental
> updates to the minimum/maximum distances between the query point and a
> node's bounding hyperrectangle) and to keep a very similar style to
> the existing code (hence the int's, double's, malloc/free, etc.). I
> only went about porting over the rest of the methods essentially to
> submit a complete enhancement to SciPy instead of just the juicy bit I
> cared about (after parsing Anne's code, I figured 90% of the effort
> was done already). I also agree that Anne's code is really
> transparent, and should stay no matter what, but speed does matter a
> fair bit to users of kd-trees. And in my defense with respect to the
> code being "full of bugs", I tried hard to make all the unit tests
> pass on my machine before submitting and was completely unaware of the
> nightmares of 32 vs 64-bit ints in Windows 64.
Yes, Linux and Apple cheated and made a C long 64 bits. Whereas AMD
designed AMD64 (aka x64) to use a 64-bit pointer with a 32-bit offset.
Windows followed that pattern and uses a 32-bit long. Whereas on IA64,
Intel has specified a 64-bit offset, so Windows uses a 64-bit long
there. It is actually Windows that does this correctly (pedantially
speaking), but the 64-bit long on Apple and Linux obviously makes life
easier. On the good side, this makes Windows 64 an excellent pantform
for tracing down integer size bugs :)
>
> That said, now that the initial effort is complete, it's obvious that
> a clear pattern recurs in the kd-tree walking algorithms: keeping
> incremental track of the minimum/maximum distances between two
> hyperrectangles as they're successively split (alternatively, a
> hyperrectangle and a point). If that primitive is abstracted, the
> code becomes a lot more readable again. I've had a first go at what
> such an abstraction might look like, and updated the code to
> query_ball_tree. Does this look like a way forward for enhancing the
> code's maintainability?
I'll look at it, but this is a very important issue.
I think this rewrite cannot be used in SciPy unless it is so
readable/understandable that several SciPy developers can understand and
change the code. At least this is my main objection with it right now.
Also observe that Cython has closures now, and you can nest Python
functions in Cython (and declaring them cpdef makes the call almost as
efficient as a cdef).
>
> Here are another couple of issues / questions:
>
> * Is there any difference between
> <np.float64_t*>np.PyArray_DATA(my_array) and &my_array[0] ? If not,
> the second choice is clearly preferable.
>
I checked Dag Sverre's Cython code again.
<np.float64_t*>np.PyArray_DATA(my_array) and &my_array[0] are equivalent.
my_Array.data is wrong.
Go ahead and change my code.
&my_array[0] will also make it easier to update to memoryviews when
Python 2.4 support is dropped.
> * I get the issues with int vs np.int <http://np.int> vs np.npy_intp,
> but is there really any SciPy-supported platform in which double
> doesn't mean np.float64_t?
Do not use C int, ever. (Except where we know it does not overflow, such
as the void functions I changed to propagate errors.)
np.int should be a C long (or C int?), for which you don't know the
size, and it might not correspond to the indexer we want (npy_intp or
Py_ssize_t). That is why you should either use np.int32 or np.int64, and
check at runtime which is applicable.
Also do not use dtype='i' or anything of that sort.
>
> * As for memory management, it's probably easier to let NumPy & Python
> handle most things, i.e. make Rectangle store two NumPy arrays. This
> needs a bit of surgery to do right, but I've tried to show what I mean
> in the query_ball_tree version I'm submitting now.
>
I thought the same thing. You can often use NumPy instead of malloc in
your code. Then you could also remove the try/finally blocks I added to
your code (but leave those I added to Anne's cKDTree code).
> * I fixed a small glitch with calls to realloc(): Sturla's changes for
> handling memory errors ended up setting the first argument to realloc
> to NULL, which effectively wipes out the contents of the data that
> were there before the resize.
Thanks.
> * There seems to be something funny about raw_mins and raw_maxes: they
> are member variables of cKDTree, but the data they point to seems to
> me like it should go out of scope at the end of cKDTree.__init__. Am
> I misunderstanding something here?
Yes you do :) See below:
cdef class cKDTree:
cdef readonly object maxes
def __init__(cKDTree self):
cdef np.ndarray[np.float64_t, ndim=1] inner_maxes
self.maxes = np.ascontiguousarray(np.amax(self.data,axis=0),
dtype=np.float64)
inner_maxes = self.maxes
self.raw_maxes = <np.float64_t*>np.PyArray_DATA(inner_maxes)
"maxes" stores the contiguous array as an attribute of the cKDTree
extension object. It will be alive until the object of type cKDTree is
reclaimed.
"inner_maxes" unboxes the array struct to a collection of local
variables for fast indexing access inside cKDTree. It also tells Cython
that the type is np.ndarray (which extends NumPy's PyArrayObject).
Then finally "raw_maxes" gets the C pointer to the contiguous buffer
stored in "maxes", by using the unboxed "inner_maxes".
We could also have done this (at least for Python 2.x):
self.raw_maxes = <np.float64_t*>
cpython.PyLong_AsVoidPtr(self.maxes.__array_interface__['data'][0])
or this
self.raw_maxes = &inner_maxes[0]
But the current "inner_maxes.data" should not be used.
Another issue is this:
Where ever you use list.append and set.add to save a tuple of indexes
(i,j), do this:
if sizeof(long) < sizeof(np.npy_intp):
# Win 64
if self.raw_indices[i] < self.raw_indices[j]:
results.add((int(self.raw_indices[i]), int(self.raw_indices[j])))
else:
results.add((int(self.raw_indices[j]), int(self.raw_indices[i])))
else:
# Other platforms
if self.raw_indices[i] < self.raw_indices[j]:
results.add((self.raw_indices[i], self.raw_indices[j]))
else:
results.add((self.raw_indices[j], self.raw_indices[i]))
sizeof(long) < sizeof(np.npy_intp) is a compile time constant so it will
be optimized by the compiler. Then you can take out the stuff I added to
fix the set for Win64 before the return of query_pairs.
You might want to put this into two inline functions (for list.append
and set.add) to avoid repeating yourself.
cdef inline int set_add_pair(set results, np.npy_intp i, np.npy_intp j)
except -1:
if sizeof(long) < sizeof(np.npy_intp):
# Win 64
if i < j:
results.add((int(j), int(j)))
else:
results.add((int(j), int(i)))
else:
# Other platforms
if i < j:
results.add((i, j))
else:
results.add((j, i))
return 0
The difference is that on Win 64 we should use Python long istead of
Python int if a C long (i.e. Pyhton int) overflows, which the function
int() will ensure. Cython automatically converts np.npy_intp to Python
long on Win 64, which we want to convert to a Python int if it is
possible. On other platforms we don't want this extra overhead.
I think Cython will optimize set.add, otherwise you should put it in a
local variable istead of getting a dict lookup on results.add (I know
list.append is optimized).
The int return value for set_add_pair is used to propagate Python
exceptions, which can result from results.add() or int().
Sturla
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.scipy.org/pipermail/scipy-dev/attachments/20120716/b5d6bcf9/attachment-0001.html
More information about the SciPy-Dev
mailing list