[Numpy-discussion] please change mean to use dtype=float

Sebastian Haase haase at msg.ucsf.edu
Tue Sep 19 22:39:04 CDT 2006

Charles R Harris wrote:
> On 9/19/06, Sebastian Haase <haase at msg.ucsf.edu> wrote:
>     Travis Oliphant wrote:
>      > Sebastian Haase wrote:
>      >> I still would argue that getting a "good" (smaller rounding errors)
>      >> answer should be the default -- if speed is wanted, then *that* could
>      >> be still specified by explicitly using dtype=float32 (which would also
>      >> be a possible choice for int32 input).
>      >>
>      > So you are arguing for using long double then.... ;-)
>      >
>      >> In image processing we always want means to be calculated in float64
>      >> even though input data is always float32 (if not uint16).
>      >>
>      >> Also it is simpler to say "float64 is the default" (full stop.) - instead
>      >>
>      >> "float64 is the default unless you have float32"
>      >>
>      > "the type you have is the default except for integers".  Do you really
>      > want float64 to be the default for float96?
>      >
>      > Unless we are going to use long double as the default, then I'm not
>      > convinced that we should special-case the "double" type.
>      >
>     I guess I'm not really aware of the float96 type ...
>     Is that a "machine type" on any system?  I always thought that -- e.g.
>     coming from C -- double is "as good as it gets"...
>     Who uses float96 ?  I heard somewhere that (some) CPUs use 80bits
>     internally when calculating 64bit double-precision...
>     Is this not going into some academic argument !?
>     For all I know, calculating mean()s (and sum()s, ...) is always done in
>     double precision -- never in single precision, even when the data is in
>     float32.
>     Having float32 be the default for float32 data is just requiring more
>     typing, and more explaining ...  it would compromise numpy usability as
>     a day-to-day replacement for other systems.
>     Sorry, if I'm being ignorant ...
> I'm going to side with Travis here. It is only a default and easily 
> overridden. And yes, there are precisions greater than double. I was 
> using quad precision back in the eighties on a VAX for some inherently 
> ill conditioned problems. And on my machine long double is 12 bytes.
> Chuck
I just did a web search for "long double", and it does not look like there
is much agreement on what that is - see also
http://en.wikipedia.org/wiki/Long_double
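
For what it's worth, one way to see what "long double" means on a given machine is to ask numpy itself. This is only a minimal sketch: it assumes a numpy that exposes np.longdouble and np.finfo, the variable names are mine, and the numbers it prints will differ from platform to platform (which is rather the point).

```python
import numpy as np

# Storage size in bytes of the platform's long double (platform dependent;
# this may include padding, so it is not the same thing as precision).
ld_bytes = np.dtype(np.longdouble).itemsize

# Floating-point characteristics as reported by numpy.
ld_info = np.finfo(np.longdouble)
d_info = np.finfo(np.float64)

print("long double: %d bytes, %d mantissa bits" % (ld_bytes, ld_info.nmant))
print("double:      %d bytes, %d mantissa bits"
      % (np.dtype(np.float64).itemsize, d_info.nmant))
```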

I really think that float96 *is* a special case - but computing mean()s 
and var()s in float32 would be "bad science".
I hope I'm not alone in seeing numpy as a great "interactive platform" to 
evaluate data...
I know that having too much knowledge of the details often makes one 
forget what the "newcomers" will do and expect. We are only talking 
about people that will a) work with single-precision data (e.g. large 
scale-image analysis) and who b) will tend to "just use the default" 
(dtype)  --- How else can I say this: these people will just assume that 
arr.mean() *is* the mean of arr.
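
To make the point concrete, here is a small sketch of the difference between the default and an explicit dtype=float64. It assumes the behavior where mean() of floating-point input uses the input's own precision unless dtype is given; the array and the variable names are made up for illustration. The last part shows the underlying mechanism: float32 carries 24 significant bits, so once a running sum reaches 2**24, adding 1 no longer changes it.

```python
import numpy as np

# Hypothetical stand-in for single-precision image data.
arr = np.full(1000, 0.1, dtype=np.float32)

m_default = arr.mean()                 # computed in the input precision (float32)
m_double = arr.mean(dtype=np.float64)  # explicitly computed in double precision

print(m_default.dtype, m_double.dtype)

# Mechanism behind the rounding problem: a float32 accumulator
# cannot represent 2**24 + 1, so the increment is lost.
big = np.float32(2**24)
stalled = big + np.float32(1)

# The same addition in float64 is exact.
exact = np.float64(2**24) + np.float64(1)

print(stalled == big, exact)
```

How large the error in the float32 mean actually gets depends on the data and on how the summation is implemented, so the sketch does not claim a particular number.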

