[Numpy-discussion] scipy.base (Numeric3) ready for alpha use
oliphant at ee.byu.edu
Wed Aug 10 11:53:14 CDT 2005
For anybody interested in the development of scipy.base: the repository
is in a state that can be tested and played with.
I'm sure there are bugs, but I've removed the ones I've found. I'd be
interested in help in tracking down others.
Over the next few weeks, we will be attempting to build scipy using the
new scipy.base. This should also help iron out some problems.
Some of the notable features that came out of the ufunc adapting process are:
1) reductions can now take place over a type different from the array
type. Thus, if B is a byte array, you can reduce over a long type to
avoid modular arithmetic (overflow).
2) reduceat now takes an axis argument
3) copies are not made of large arrays but a buffering-scheme is used
for casting and mis-behaved arrays.
4) the size of buffers used and what is meant by "large array" can be
adjusted on a per function / module / global basis by setting the
variable UFUNC_BUFSIZE in the local / module / global (builtin) scope
5) how errors in ufuncs are handled can be set and over-ridden on a
function / module / global basis through the variable UFUNC_ERRMASK
6) you have another option besides ignore, warn, or raise. You can
specify a Python function to call when an error occurs through the
variable UFUNC_ERRFUNC (right now all errors go to this same function
and a string is passed indicating which error has occurred).
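Items 1 and 2 can be tried out roughly as follows. This is only a sketch written against present-day NumPy, the descendant of this code base; the `dtype` and `axis` keyword spellings are taken from current NumPy and are an assumption about how the alpha exposes them.

```python
import numpy as np

# A small unsigned byte array whose true sum (600) does not fit in 8 bits.
B = np.array([200, 200, 200], dtype=np.uint8)

# Reducing in the array's own type wraps around modulo 256.
wrapped = np.add.reduce(B, dtype=np.uint8)

# Reducing over a wider type avoids the modular arithmetic.
exact = np.add.reduce(B, dtype=np.int64)

# reduceat with an axis argument: for every row, sum columns 0:2 and 2:4.
A = np.arange(12).reshape(3, 4)
blocks = np.add.reduceat(A, [0, 2], axis=1)

print(wrapped, exact)
print(blocks)
```

The explicit `dtype=np.uint8` is there on purpose: current NumPy widens small integer types by default when summing, so the wrap-around only shows up when the narrow type is requested.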
I should explain this idea of using variables to set information for the
ufuncs. It comes out of an idea that Guido mentioned while Perry, Paul,
and I met with him back in March. When he was informed of numarray's
stack approach to error handling he questioned that design decision. He
wondered if the error handling could not be defined on a per module basis.
With that idea, it was relatively straightforward to implement a
procedure wherein error behavior for ufuncs is determined by
looking in the local, then global (module level), and finally builtin
scope for a specific variable.
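The look-up order described above can be imitated in pure Python. The following is a toy model, not the actual C implementation: `lookup_setting` and `divide` are hypothetical helpers, and the string values used for UFUNC_ERRMASK ("ignore", "raise", "call") are a simplification assumed for illustration.

```python
import builtins
import sys


def lookup_setting(frame, name, default):
    """Search the given frame's local, then global, then builtin scope."""
    if name in frame.f_locals:
        return frame.f_locals[name]
    if name in frame.f_globals:
        return frame.f_globals[name]
    return getattr(builtins, name, default)


def divide(a, b):
    """Toy stand-in for a ufunc: it consults the caller's scopes first."""
    caller = sys._getframe(1)  # frame of the code that called divide()
    mode = lookup_setting(caller, "UFUNC_ERRMASK", "ignore")
    try:
        return a / b
    except ZeroDivisionError:
        if mode == "raise":
            raise
        if mode == "call":
            handler = lookup_setting(caller, "UFUNC_ERRFUNC", None)
            if handler is not None:
                # A string tells the handler which error occurred.
                handler("divide by zero")
        # Simplification: always return +inf, ignoring the sign of a.
        return float("inf")


def default_caller():
    # No variable set anywhere, so the default ("ignore") applies.
    return divide(1.0, 0.0)


def strict_caller():
    UFUNC_ERRMASK = "raise"  # local override, seen only by calls made here
    return divide(1.0, 0.0)


def callback_caller(log):
    UFUNC_ERRMASK = "call"
    UFUNC_ERRFUNC = log.append  # any Python callable taking one string
    return divide(1.0, 0.0)
```

Setting the same names at module level, or on the builtins module, would change the behavior for a whole module or for the whole process respectively, since those scopes are searched only after the local one.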
This look-up is done at the beginning of the ufunc call. It will
obviously add some time to code which loops through repeated look up
calls (how much I'm not sure). Perhaps there is a way to ameliorate
this, but until we see some performance issues, I'm not inclined to
spend too much time on premature optimization.
Comments are welcome, and so especially are people inclined to try out
very alpha code (i.e. it could segfault on you).