[Numpy-discussion] Fast clip for native types, 2d version

David Cournapeau david at ar.media.kyoto-u.ac.jp
Sun Jan 14 04:30:40 CST 2007


Hi,

   First, I wanted to thank everybody who helped me clarify many 
points concerning the memory layout of numpy arrays; I think I have a much 
clearer idea of the way numpy arrays behave at the C level.
   I've used all that information to correct my initial implementation 
of clip and to speed it up for the common cases: it is faster only for 
native-endian arrays with scalar min and max (both contiguous and 
non-contiguous cases).
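
   To make the fast-path condition concrete, here is a minimal Python 
sketch (it is not the attached C code). The names can_use_fast_clip and 
clip_sketch are hypothetical and only serve the illustration; the 
"fast" branch is a plain numpy stand-in for the specialized C loop.

```python
import numpy as np

def can_use_fast_clip(a, lo, hi):
    """Hypothetical helper: True when the fast path described above applies.

    The conditions are taken from the text: native byte order for the
    input array, and scalar min and max. Contiguity is not checked,
    since both contiguous and non-contiguous inputs are covered.
    """
    return a.dtype.isnative and np.isscalar(lo) and np.isscalar(hi)

def clip_sketch(a, lo, hi):
    """Dispatch between a stand-in fast path and the generic numpy.clip."""
    if can_use_fast_clip(a, lo, hi):
        # Stand-in for the specialized C loop: max with lo, then min with hi.
        return np.minimum(np.maximum(a, lo), hi)
    # Every other case (byte-swapped input, array bounds) stays generic.
    return np.clip(a, lo, hi)
```

   A byte-swapped input, or an array passed as min or max, falls 
through to the generic numpy.clip in this sketch.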
  
    I've attached the new version of the code (for one type only, to 
keep the email small; you have to download the archive to actually compile 
the implementation). The whole package, with tests and a profiling script, 
is here:

    http://www.ar.media.kyoto-u.ac.jp/members/david/archives/fastclip.tgz

    If this looks OK, I will prepare a patch against current numpy, with 
the C sources generated by numpy.distutils instead of the tool I 
am using now (autogen).

    Now, to improve the other cases (mainly implementing an in-place clip 
function and non-scalar min/max), some clarifications are needed, 
mainly about broadcasting rules and about the current clip implementation, 
which seems to break numpy conventions:

   1: the old implementation returns an array which has the same 
endianness as the input array. This is a bit odd, because when the 
input is byte swapped, the returned array is still byte swapped, which 
seems to go against numpy convention. Here is some code whose behaviour 
seems odd to me (the code assumes a little-endian machine):

import numpy

a = numpy.random.randn(3, 2)
b = a.astype(a.dtype.newbyteorder('>'))
c = b.copy()
assert a.dtype.isnative
assert not b.dtype.isnative
assert not c.dtype.isnative
# Endianness behaviour of basic operations on numpy arrays
print((a + b).dtype.isnative)  # one arg is non native -> returns native
print((b + c).dtype.isnative)  # both args non native -> returns native
# Now, what's happening endian-wise with clip:
# everything native -> returns native
print(numpy.clip(a, -0.5, 0.5).dtype.isnative)
# input array non native -> returns non native
print(numpy.clip(b, -0.5, 0.5).dtype.isnative)
# input array non native, native array min -> returns native
print(numpy.clip(b, a, 0.5).dtype.isnative)

    Isn't it inconsistent that the output's endianness depends on 
whether the min/max arguments are arrays or scalars?

   2: the old implementation does not upcast the input array. If the 
input is int32 and min/max are float32, the function fails; if the input 
is float32 and min/max are float64, the output is still float32. Again, 
this seems to go against the expected numpy behaviour?
   3: the old implementation supports clipping with complex arrays. I 
don't see any obvious meaningful definition of clipping in that 
case (comparing values by their modulus?).
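
   To make the three points above concrete, here is a rough Python 
sketch of what clip might look like if it followed the usual numpy 
conventions. It is only an assumption about the desired behaviour, not 
the current implementation nor the attached C code; conventional_clip 
is a hypothetical name, and rejecting complex input is just one possible 
answer to point 3.

```python
import numpy as np

def conventional_clip(a, lo, hi):
    """Hypothetical reference clip following ordinary numpy conventions."""
    a, lo, hi = np.asarray(a), np.asarray(lo), np.asarray(hi)
    # Point 2: promote the input and both bounds to a common dtype.
    # Note: Python floats become float64 here because of np.asarray.
    out_dtype = np.result_type(a.dtype, lo.dtype, hi.dtype)
    # Point 1: always produce a native byte order result.
    out_dtype = out_dtype.newbyteorder('=')
    # Point 3: refuse complex values, which have no natural ordering.
    if np.issubdtype(out_dtype, np.complexfloating):
        raise TypeError("clip is not defined for complex values in this sketch")
    a, lo, hi = (x.astype(out_dtype) for x in (a, lo, hi))
    # Broadcasting of array bounds is handled by maximum/minimum.
    return np.minimum(np.maximum(a, lo), hi)
```

   With this sketch, an int32 input with float32 bounds gives a float64 
result, and a byte-swapped input gives a native result.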

   If breaking these oddities is allowed, the improvements would be 
much simpler to code.

   cheers,

   David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fast_clip.c
Type: text/x-csrc
Size: 8695 bytes
Desc: not available
Url : http://projects.scipy.org/pipermail/numpy-discussion/attachments/20070114/08505f05/attachment-0002.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clip_imp.c
Type: text/x-csrc
Size: 3509 bytes
Desc: not available
Url : http://projects.scipy.org/pipermail/numpy-discussion/attachments/20070114/08505f05/attachment-0003.bin 

