[Numpy-discussion] numpy error handling
oliphant.travis at ieee.org
Sat Apr 1 12:20:01 CST 2006
Tim Hochberg wrote:
>> You can get the numarray approach back simply by setting the error in
>> the builtin scope (instead of in the local scope which is done by
> I saw that you could set it at different levels, but missed the
> implications. However, it's still missing one feature, thread local
> storage. I would argue that the __builtin__ data should actually be
> stored in threading.local() instead of __builtin__. Then you could
> setup an equivalent stack system to numpy's.
Yes, the per-thread storage escaped me. But, threading.local() only
exists in Python 2.4, and NumPy is supposed to be compatible with Python 2.3.
What about PyThreadState_GetDict(), defaulting to the builtin
dictionary if it returns NULL?
I'm actually not particularly enthused about the three name-space
lookups. Changing it to only 1 place to look may be better. It would
require a setting and restoring operation. A stack could be used, but
why not just use local variables (i.e.
save = numpy.seterr(dividebyzero='warn')
> I've used the numarray error handling stuff for some time. My
> experience with it has led me to the following conclusions:
> 1. You don't use it that often. I have about 26 KLOC that's "active"
> and in that I use pushMode just 15 times. For comparison, I use
> asarray a tad over 100 times.
> 2. pushMode and popMode, modulo spelling, is the way to set errors.
> Once the with statement is around, that will be even better.
> 3. I, personally, would be very unlikely to use the local and global
> error handling, I'd just as soon see them go away, particularly if
> it helps performance, but I won't lobby for it.
This is good feedback. I have almost zero experience with changing the
error handling. So, I'm not sure what features are desirable.
Eliminating unnecessary name-lookups is usually a good thing.
> In numarray, the stack is in the numarray module itself (actually in
> the Error object). They base their threading local behaviour off of
> thread.get_ident, not threading.local. That's not clunky at all,
> although it's arguably wrong since thread.get_ident can reuse ids from
> dead threads. In practice it's probably hard to get into trouble doing
> this, but I still wouldn't emulate it. I think that this was written
> before thread local storage, so it was probably the best that could be
Right, but thread local storage is still Python 2.4 only....
What about PyThreadState_GetDict() ?
> However, if you use threading.local, it will be clunky in a similar
> sense. You'll be storing data in a global namespace you don't control
> and you've got to hope that no one stomps on your variable name.
The PyThreadState_GetDict() documentation states that extension module
writers should use a unique name based on their extension module.
> When you have local and module level secret storage names as well
> you're just doing a lot more of that and the chance of collision and
> confusion goes up from almost zero to very small.
This is true. Similar to the C-variable naming issues.
>> So, we should at least frame the discussion in terms of what is
>> actually possible.
> Yes, sorry for spreading misinformation.
But you did point out the very important thread-local storage fact that
I had missed. This alone makes me willing to revamp what we are doing.
> In this case, overflow, underflow and dividebyzero seem pretty self
> documenting to me. And 'invalid' is pretty cryptic in both
> implementations. This may be a matter of taste, but I tend to prefer
> short pithy names for functions that I use a lot, or that crammed a
> bunch to a line. In functions like this, that are more rarely used and
> get a full line to themselves I lean towards the more verbose.
The rarely-used factor is a persuasive argument.
> Can you elaborate on this a bit? Reading between the lines, there seem
> to be two issues related to speed here. One is the actual namespace
> lookup of the error mode -- there's a setting that says we are using
> the defaults, so don't bother to look. This saves the namespace
> lookup. Changing the defaults shouldn't affect the timing of that.
> I'm not sure how this would interact with thread local storage though.
> The second issue is that running the core loop with no checks in place
> is faster.
Basically, on the C-level, the error mode is an integer with specific
bits allocated to the various error possibilities (2 bits per
possibility). If this is 0 then the error checking is not even done
(thus no error handling at all).
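To make the packing concrete, here is a small Python sketch of an integer error mode with 2 bits per error kind. The bit order, the mode codes and the function names are assumptions for illustration only; the actual C-level layout may differ.

```python
# Assumed codes: 2 bits give room for four modes per error kind.
MODES = ("ignore", "warn", "raise", "call")
# Assumed bit order, lowest bits first.
KINDS = ("dividebyzero", "overflow", "underflow", "invalid")

def pack_errmask(**modes):
    """Pack per-kind modes into one integer, 2 bits per kind."""
    mask = 0
    for i, kind in enumerate(KINDS):
        code = MODES.index(modes.get(kind, "ignore"))
        mask |= code << (2 * i)
    return mask

def unpack_errmask(mask):
    """Recover the per-kind modes from a packed integer."""
    return dict((kind, MODES[(mask >> (2 * i)) & 3])
                for i, kind in enumerate(KINDS))

def needs_check(mask):
    """A mask of 0 means every kind is ignored, so checking can be skipped."""
    return mask != 0
```

With this layout, the "everything ignored" case is a single integer comparison against zero, which is what lets the inner loop skip error checking entirely.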
Yes, the name-lookup optimization could work with any defaults (but with
thread-specific storage couldn't work anyway).
I do have one question about threads and error handling, though. Right now,
the ufuncs release the Python lock during computation (and re-acquire it
to do error handling if needed). If another ufunc was started by
another Python thread and ran with different error handling, wouldn't
the IEEE flags get confused about which ufunc was setting what? The
flags are only checked after each 1-d loop. If another thread set the
processor flag, the current thread could get very confused.
This seems like a problem that I'm not sure how to handle.
> It's not entirely plucked out of the air. As I recall, the decision
> was arrived at something like this:
> 1. Errors should never pass silently (unless explicitly silenced).
> 2. Let's have everything raise by default
> 3. In practice this was no good because you often wanted to look at
> the results and see where the problem was.
> 4. OK, let's have everything warn
> 5. This almost worked, but underflow was almost never a real error,
> so everyone always overrode underflow. A default that you always
> need to override is not a good default.
> 6. So, warn for everything except underflow. Ignore that.
> And that's where numarray is today. I and others have been using that
> error system happily for quite some time now. At least I haven't heard
> any complaints for quite a while.
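The defaults described in the list above (warn for everything except underflow, which is ignored) could be expressed roughly like this. The dict name, the handle_error function and the message text are hypothetical; this is only a sketch of how a mode table might drive the ignore/warn/raise decision, not numarray's code.

```python
import warnings

# Defaults as described above: warn on everything except underflow.
NUMARRAY_STYLE_DEFAULTS = {"dividebyzero": "warn", "overflow": "warn",
                           "underflow": "ignore", "invalid": "warn"}

def handle_error(kind, modes=NUMARRAY_STYLE_DEFAULTS):
    """Act on a detected floating-point condition according to its mode."""
    mode = modes[kind]
    if mode == "ignore":
        return
    if mode == "warn":
        warnings.warn("%s encountered" % kind, RuntimeWarning)
    elif mode == "raise":
        raise FloatingPointError("%s encountered" % kind)
    else:
        raise ValueError("unknown mode %r for %s" % (mode, kind))
```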
I can appreciate this choice, but I don't agree that errors should never
pass silently. The fact that people disagree about this is the reason
for the error handling. Note that overflow is not detected everywhere
for integers --- we have to simulate the floating-point errors for
them. Only on integer multiply is it detected. Checking for it would
slow down all other integer arithmetic --- one solution, of course is to
have two different integer additions (one that checks for overflow and
another that doesn't).
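The "two different integer additions" idea can be illustrated in Python by simulating 32-bit signed arithmetic. The function names and the choice of 32 bits are assumptions for the sketch; the real loops would be written in C for each integer type.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def add_wrapping(a, b):
    """32-bit signed add that wraps silently on overflow (no check)."""
    return (a + b - INT32_MIN) % 2**32 + INT32_MIN

def add_checked(a, b):
    """32-bit signed add that raises if the true sum does not fit."""
    total = a + b
    if total < INT32_MIN or total > INT32_MAX:
        raise OverflowError("int32 overflow in %d + %d" % (a, b))
    return total
```

The unchecked version does no comparison at all, which is the speed argument; the checked version pays for a range test on every element.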
There is really a bit of work left here to do.