[Numpy-discussion] A proposal to implement round in C Was: Rookie problems - Why is C-code much faster?
Sasha
ndarray at mac.com
Tue Feb 21 17:06:08 CST 2006
[I am reposting this under a different subject because my original
post got buried in a long thread that went on to discussing unrelated
topics. Sorry if you had to read this post twice.]
It turns out that around (or round_) is implemented in python:
def round_(a, decimals=0):
"""Round 'a' to the given number of decimal places. Rounding
behaviour is equivalent to Python.
Return 'a' if the array is not floating point. Round both the real
and imaginary parts separately if the array is complex.
"""
a = asarray(a)
if not issubclass(a.dtype.type, _nx.inexact):
return a
if issubclass(a.dtype.type, _nx.complexfloating):
return round_(a.real, decimals) + 1j*round_(a.imag, decimals)
if decimals is not 0:
decimals = asarray(decimals)
s = sign(a)
if decimals is not 0:
a = absolute(multiply(a, 10.**decimals))
else:
a = absolute(a)
rem = a-asarray(a).astype(_nx.intp)
a = _nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a))
# convert back
if decimals is not 0:
return multiply(a, s/(10.**decimals))
else:
return multiply(a, s)
I see many ways to improve the performance here. First, there is no
need to check for "decimals is not 0" three times. This can be done
once, maybe at the expense of some code duplication. Second,
_nx.where(_nx.less(rem, 0.5), _nx.floor(a), _nx.ceil(a)) seems to be
equivalent to _nx.floor(a+0.5). Finally, if rint is implemented as a
ufunc as Mads originally suggested, "decimals is 0" branch can just
call that.
It is tempting to rewrite the whole thing in C, but before I do that
I have a few questions about current implementation.
1. It is implemented in oldnumeric.py . Does this mean it is
deprecated. If so, what is the recommended replacement?
2. Was it intended to support array and fractional values for
decimals or is it an implementation artifact. Currently:
>>> around(array([1.2345]*5),[1,2,3,4,5])
array([ 1.2 , 1.23 , 1.235 , 1.2345, 1.2345])
>>> around(1.2345,2.5)
array(1.2332882874656679)
3. It does nothing to exact types, even if decimals<0
>>> around(1234, -2)
array(1234)
Is this a bug? Consider that
>>> round(1234, -2)
1200.0
and
>>> around(1234., -2)
array(1200.0)
Docstring is self-contradictory: "Rounding behaviour is equivalent to
Python" is not consistent with "Return 'a' if the array is not
floating point."
I propose to deprecate around and implement a new "round" member
function in C that will only accept scalar "decimals" and will behave
like a properly vectorized builtin round. I will do the coding if
there is interest.
In any case, something has to be done here. I don't think the
following timings are acceptable:
> python -m timeit -s "from numpy import array; x = array([1.5]*1000)" "(x+0.5).astype(int).astype(float)"
100000 loops, best of 3: 18.8 usec per loop
> python -m timeit -s "from numpy import array, around; x = array([1.5]*1000)" "around(x)"
10000 loops, best of 3: 155 usec per loop
More information about the Numpy-discussion
mailing list