[Numpy-discussion] coercion rules for float32 in numpy are different from numarray

Sebastian Haase haase at msg.ucsf.edu
Fri Aug 25 14:32:25 CDT 2006


On Friday 25 August 2006 12:19, Charles R Harris wrote:
> Hi,
>
> On 8/25/06, Travis Oliphant <oliphant.travis at ieee.org> wrote:
> > Sebastian Haase wrote:
> > >> This is now the behavior in SVN.   Note that this is different from both
> > >> Numeric (which gave an error) and numarray (which coerced to float32).
> > >>
> > >> But, it is consistent with how mixed-types are handled in calculations
> > >> and is thus an easier rule to explain.
> > >>
> > >> Thanks for the testing.
> > >>
> > >> -Travis
> > >
> > > How hard would it be to change the rules back to the numarray behavior?
> >
> > It wouldn't be hard, but I'm not so sure that's a good idea.   I do see
> > the logic behind that approach and it is worthy of some discussion.
> > I'll give my current opinion:
> >
> > The reason I changed the behavior is to get consistency so there is one
> > set of rules on mixed-type interaction to explain. You can always do
> > what you want by force-casting your int32 arrays to float32.    There
> > will always be some people who don't like whichever behavior is
> > selected, but we are trying to move NumPy in a direction of consistency
> > with fewer exceptions to explain (although this is a guideline and not
> > an absolute requirement).
> >
> > Mixed-type interaction is always somewhat ambiguous.  Now there is a
> > consistent rule for both universal functions and other functions (move
> > to a precision where both can be safely cast to --- unless one is a
> > scalar and then its precision is ignored).
>
> I think this is a good thing. It makes it easy to remember what the
> function will produce. The only oddity the user has to be aware of is that
> int32 has more precision than float32. Probably not obvious to a newbie,
> but a newbie will probably be using the double defaults anyway. Which is
> another good reason for making double the default type.

Not true: a newcomer to numpy (or to numeric programming) working in medical
imaging or astronomy would still be handed float32 data to work with. He or she
would do some operations on the data and be surprised when memory (or disk)
usage blows up because the results come back as float64.
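
To make the surprise concrete, here is a minimal sketch of the rule as Travis
describes it (array-with-array promotes to a type both can be safely cast to;
a scalar's precision is ignored). The array names and shapes are made up for
illustration, and I am assuming the behavior that is in SVN now:

```python
import numpy as np

img = np.ones((4, 4), dtype=np.float32)   # stand-in for float32 image data
mask = np.ones((4, 4), dtype=np.int32)    # stand-in for an int32 array

mixed = img * mask    # array with array: both dtypes count, result is float64
scaled = img * 2.0    # array with Python scalar: the scalar's precision is ignored

print(mixed.dtype, mixed.nbytes)     # twice the bytes of img
print(scaled.dtype, scaled.nbytes)   # same size as img
```

So multiplying by a scalar is harmless, but a single int32 array anywhere in
the calculation doubles the footprint of the result.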

> > If you don't want that to happen, then be clear about what data-type
> > should be used by casting yourself.   In this case, we should probably
> > not try and guess about what users really want in mixed data-type
> > situations.
>
> I wonder if it would be reasonable to add the dtype keyword to hstack
> itself? Hmmm, what are the conventions for coercions to lesser precision?
> That could get messy indeed, maybe it is best to leave such things alone
> and let the programmer deal with it by rethinking the program. In the float
> case that would probably mean using a float32 array instead of an int32
> array.
>
> Chuck

I think my main argument is that float32 is a very common type in (large) data
processing because it saves memory.
But I don't know how many exceptions, such as an extra "float32 rule", we can
handle ...
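
For reference, the workaround suggested above (casting yourself) would look
roughly like the sketch below for the hstack case. The variable names are
hypothetical, and the third value is chosen only to show what the cast costs:
float32 cannot represent every int32 exactly.

```python
import numpy as np

counts = np.array([1, 2, 2**24 + 1], dtype=np.int32)
data = np.zeros(3, dtype=np.float32)

default = np.hstack([counts, data])                    # promoted to float64
forced = np.hstack([counts.astype(np.float32), data])  # explicit cast keeps float32

print(default.dtype, forced.dtype)
# float32 has a 24-bit significand, so 2**24 + 1 is rounded by the cast,
# while the float64 result keeps it exactly.
print(int(default[2]), int(forced[2]))
```

This keeps the memory use down, at the price of having to remember the cast
at every place where int32 and float32 arrays meet.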

I would like to hear what the numarray (STScI) folks think about this.  Who
else works with data on the order of GBs?

- Sebastian




