[SciPy-dev] Statistics toolbox and nans
rossini at blindglobe.net
Fri Nov 1 15:01:30 CST 2002
>>>>> "travis" == Travis Oliphant <oliphant at ee.byu.edu> writes:
travis> Right now, to me this is a straw man (a hypothetical argument).
I agree (i.e., NAN itself is not the problem -- I'd probably complain
about any value that could cause confusion).
travis> Now, I agree that treating missing values using NaNs is somewhat of a
travis> kludge. And there are perhaps better ways to handle it. It is a rather
travis> efficient kludge that works much of the time.
travis> Even if you don't officially bless nan's as "missing values," if they
travis> ever show up in your calculation, they essentially are missing values and
travis> the question still remains as to how to deal with them (should you ignore
travis> them or let them ruin the rest of your calculation?)
This is the crux of the issue -- from a statistical perspective
(different from a numerical analyst's, from what I can tell), it
would be important to flag different forms of missing data, in order
to process them in different manners. Using a single NAN doesn't allow
for this (i.e. numerical missingness vs. statistical missingness, both
of which may be present depending on the data and the data analysis
algorithm used for processing).
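To make the distinction concrete, here is a minimal sketch in plain Python of what flagging two kinds of missingness might look like. Everything in it is hypothetical and for illustration only: the Missing enum, the mean_with_policy function, and the policy itself (skip statistically missing values, but let a numerically missing value invalidate the result) are assumptions, not anything that exists in SciPy.

```python
import math
from enum import Enum


class Missing(Enum):
    """Hypothetical flags for two distinct kinds of missing data."""
    NUMERICAL = "numerical"      # e.g. the result of an invalid operation
    STATISTICAL = "statistical"  # e.g. a value that was never observed


def mean_with_policy(values):
    """Mean that treats the two kinds of missingness differently.

    Assumed policy (illustrative only):
      - Missing.NUMERICAL anywhere makes the result NaN.
      - Missing.STATISTICAL entries are skipped.
    """
    if any(v is Missing.NUMERICAL for v in values):
        return float("nan")
    observed = [v for v in values if v is not Missing.STATISTICAL]
    if not observed:
        return float("nan")
    return sum(observed) / len(observed)


survey = [2.0, Missing.STATISTICAL, 4.0]   # one value never recorded
computed = [2.0, Missing.NUMERICAL, 4.0]   # one value from a failed computation

skipped = mean_with_policy(survey)      # statistical gap is ignored
poisoned = mean_with_policy(computed)   # numerical gap invalidates the mean
```

With a single NAN standing in for both flags, the function above could not tell the two inputs apart and would have to pick one behavior for both.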
A.J. Rossini Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics rossini at u.washington.edu
FHCRC/SCHARP/HIV Vaccine Trials Net rossini at scharp.org
-------------- http://software.biostat.washington.edu/ ----------------
FHCRC: M: 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW: Th: 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX
(my tuesday/wednesday/friday locations are completely unpredictable.)