[SciPy-dev] design of a physical quantities package: seeking comments
Sat Aug 2 15:23:43 CDT 2008
2008/8/2 Darren Dale <email@example.com>:
> I have been thinking about how to handle physical quantities by subclassing
> ndarray and building on some of the ideas and code from
> Charles Doutriaux python wrappers of udunits and Enthought's units package.
This sounds like a very handy tool, but you want to be careful to keep
> I would like to share my current thinking, and would appreciate some feedback
> at these early stages so I can get the basic design right.
> The proposed Quantity object would be an ndarray subclass with an additional
> attribute and property:
> * Quantity has a private attribute, a Dimensions container, which contains
> zero or more Dimension objects, each having associated units (a string) and a
> power (a number). The container object would be equipped with the various
> __add__, __sub__, __mul__, etc. Before performing the operations on the
> ndarray values, the operation would be performed with the Dimensions
> containers, either updating self's Dimensions and yielding the conversion
> factors required to scale the other quantity's values to perform the ndarray
> operation, or raising an exception because the two dimensionalities are not
> commensurate for the particular operation. I think this container approach is
> necessary in order to allow scaling of each Dimension's units individually,
> simplifying operations like 11 ft mA / ns * 1 hour = many ft mA.
I think it's a good idea to try to keep things in the units they are
provided in; this should reduce the occurrences of unexpected
overflows when (for example) cubing a distance in megaparsecs (put
this in centimetres and use single precision and you might overflow).
But this does mean you need to decide, when adding (say) a distance in
feet to a distance in metres, which unit the result should be in.
Users will presumably also want some kind of unit normalization
function that just converts its input to SI. You will also have to
decide at what point simplifications occur - do ft/m immediately get
converted as soon as they are produced? What about units like pc/cm^3
(actually in very common use in radio astronomy)? How do you preserve
pc/m^3 without getting abominations like kg m^2 s^2/kg m s^2?
How are users going to specify units? Using the existing packages, I
found it useful to make the units into variables:
kg = Unit("kg")
so that I could then do
wt = 10*kg
and the error checking would behave nicely.
> * Quantity has a public units property, providing a view into the object's
> dimensions and the ability to change from one set of units to another.
> q.units would return the Dimensions instance, whose __repr__ would be
> dynamically constructed from each dimension's units and power attributes. The
> setter would have some limitations by design. For example if q has units of
> kg m / s^2 and you do q.units='ft', then q.units would return kg ft /s^2.
Hmm. This kind of guessing is likely to trip people up. At least the
default should be "convert to exactly the unit I specified". After
all, the point of using units is that they catch many mathematical
errors, and the sooner they are caught the better. There is something
to be said for a "unit globbing" system ("convert all occurrences of
metres to feet but leave everything else alone") and for conversion to
predefined unit systems (MKS, CGS, "Imperial", metric with
prefixes...) but I don't think it should be the default.
> I think the Dimensions container may provide enough abstraction to handle more
> unusual operations if someone wanted to add them. Robert Kern suggested a few
> years back (see
> http://aspn.activestate.com/ASPN/Mail/Message/scipy-user/2538532) that a good
> physical quanitities system should be able to handle operations like a
> long,lat position minus another one would yield a distance, but addition
> would not be supported. This functionality could be built into a subclass of
> the Dimensions container.
I don't think lat/long, or even Fahrenheit/Celsius are a good idea.
For one thing, it's a short step from there to general coordinate
system conversion (what about UTM? ECEF? do you want great circle
distances or direct line?), and then to conversion of tensor
quantities between coordinate systems, and down that road lies
madness. I think to be a tractable package this needs well-defined
boundaries, and the place I'd put those boundaries is at
multiplicative units. That's enough to be genuinely useful, and it's
small enough to be doable.
> Comments and criticism welcome.
How do you plan to handle radians and degrees? Radians should really
be no unit at all, but it would sometimes be nice to have them printed
(and sometimes not).
On a related topic, how are you going to handle ufuncs? Addition and
subtraction should require commensurable units, multiplication should
multiply the units, and I think all other standard ufuncs should
require something with no units. Well, except for mean, std, and var
maybe. And "pow" is tricky. And, well, you see what I'm getting at.
What about user-defined functions? It's worth having some kind of
decorator that enforces that a particular function acts on something
with no units, and maybe enforces particular units. How can users
conveniently write something like a function to add in quadrature?
Are you going to support fractional exponents in your units? (Note
that they probably need to be exact fractions to be sure they cancel
when they're supposed to.) How are you going to deal with CGS (in
common use in astronomy for some strange reason) and other
semi-"natural" units? In these systems some of the formulas actually
look different because units have been chosen to make constants go
away. This means that there are fewer basic units (for example in GR
one sometimes sets G=c=1 and converts everything - kilograms and
meters - to seconds); how do you handle conversion between one of
these systems and SI?
As was mentioned in a previous discussion of this issue on this list,
it's worth looking at how Frink handles this. I don't recommend
necessarily following Frink's approach, since it has become quite
complicated, but it's worth understanding the issues that pushed Frink
to its current size. Keeping this package simple will definitely
involve building in limitations.
More information about the Scipy-dev