[SciPy-dev] design of a physical quantities package: seeking comments

Darren Dale dsdale24@gmail....
Sat Aug 2 20:39:49 CDT 2008


Hi Anne,

On Saturday 02 August 2008 4:23:43 pm Anne Archibald wrote:
> 2008/8/2 Darren Dale <dsdale24@gmail.com>:
> > I have been thinking about how to handle physical quantities by
> > subclassing ndarray and building on some of the ideas and code from
> > Charles Doutriaux's Python wrappers of udunits and Enthought's units
> > package.
>
> This sounds like a very handy tool, but you want to be careful to keep
> it tractable.

I don't plan on investing a big chunk of time on this. I hope to put together 
something useful and extensible, so that anyone who wants to add new features 
later will have a simple but sufficiently flexible foundation to build on.

> > I would like to share my current thinking, and would appreciate some
> > feedback at these early stages so I can get the basic design right.
> >
> > The proposed Quantity object would be an ndarray subclass with an
> > additional attribute and property:
> >
> > * Quantity has a private attribute, a Dimensions container, which
> > contains zero or more Dimension objects, each having associated units (a
> > string) and a power (a number). The container object would be equipped
> > with the various __add__, __sub__, __mul__, etc. Before performing the
> > operations on the ndarray values, the operation would be performed with
> > the Dimensions containers, either updating self's Dimensions and yielding
> > the conversion factors required to scale the other quantity's values to
> > perform the ndarray operation, or raising an exception because the two
> > dimensionalities are not commensurate for the particular operation. I
> > think this container approach is necessary in order to allow scaling of
> > each Dimension's units individually, simplifying operations like 11 ft mA
> > / ns * 1 hour = many ft mA.
>
> I think it's a good idea to try to keep things in the units they are
> provided in; this should reduce the occurrences of unexpected
> overflows when (for example) cubing a distance in megaparsecs (put
> this in centimetres and use single precision and you might overflow).
> But this does mean you need to decide, when adding (say) a distance in
> feet to a distance in metres, which unit the result should be in.

Yes, a decision will have to be made as to whether A*B+C will yield a result 
in units of A or C. I am not worried at this point about overflows or loss of 
precision when attempting to convert a quantity that is some integer dtype.
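
As a rough illustration of the container idea, here is a minimal sketch in
plain Python. The UNITS table, the Dimensions constructor signature, and the
multiply/factor_to method names are all assumptions made for this example,
not an existing API. Each base dimension keeps its own unit and power, so
multiplying ft/s by hour rescales only the time dimension. In this sketch the
left operand's units win whenever both operands share a dimension, which is
one possible answer to the "units of A or C" question.

```python
from fractions import Fraction

# Hypothetical table: unit name -> (base dimension, size in a reference unit)
UNITS = {
    "m": ("length", 1.0),
    "ft": ("length", 0.3048),
    "s": ("time", 1.0),
    "hour": ("time", 3600.0),
}


class Dimensions:
    """Maps each base dimension to a (unit, power) pair."""

    def __init__(self, **unit_powers):
        self._dims = {}
        for unit, power in unit_powers.items():
            dimension, _ = UNITS[unit]
            self._dims[dimension] = (unit, Fraction(power))

    def _signature(self):
        # Dimensionality only, ignoring which unit each dimension uses.
        return {dim: power for dim, (_, power) in self._dims.items()}

    def factor_to(self, other):
        """Factor that rescales values in self's units to other's units.

        Needed before addition or subtraction; raises if the two
        dimensionalities are not commensurate.
        """
        if self._signature() != other._signature():
            raise ValueError("incommensurate dimensions")
        factor = 1.0
        for dim, (unit, power) in self._dims.items():
            other_unit, _ = other._dims[dim]
            factor *= (UNITS[unit][1] / UNITS[other_unit][1]) ** float(power)
        return factor

    def multiply(self, other):
        """Return (resulting Dimensions, factor to apply to other's values)."""
        result = Dimensions()
        result._dims = dict(self._dims)
        factor = 1.0
        for dim, (unit, power) in other._dims.items():
            if dim in result._dims:
                own_unit, own_power = result._dims[dim]
                # Express other's contribution in the unit self already uses.
                factor *= (UNITS[unit][1] / UNITS[own_unit][1]) ** float(power)
                total = own_power + power
                if total == 0:
                    del result._dims[dim]
                else:
                    result._dims[dim] = (own_unit, total)
            else:
                result._dims[dim] = (unit, power)
        return result, factor


# 11 ft/s * 1 hour: the time dimension cancels, leaving feet.
velocity = Dimensions(ft=1, s=-1)
result, factor = velocity.multiply(Dimensions(hour=1))
distance_in_ft = 11 * 1 * factor  # 39600.0
```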

> Users will presumably also want some kind of unit normalization
> function that just converts its input to SI. You will also have to
> decide at what point simplifications occur - do ft/m immediately get
> converted as soon as they are produced?

I figured this would probably be the first requested feature. I don't plan on 
addressing it at this point, but I suppose some mechanism could be added to 
set a default system.

> What about units like pc/cm^3 
> (actually in very common use in radio astronomy)? How do you preserve
> pc/m^3 without getting abominations like kg m^2 s^2/kg m s^2?

I'm not familiar with the issue here (how do you go from length/length^3 to 
length, and where did the mass and time units come from?). To begin with, I 
plan on attacking this problem the same way one would do dimensional 
analysis, converting everything to the basic dimensions of mass, length, 
time, charge (or current) and temperature, which I think is the way 
enthought.units handles it as well. pc/m^3 would be converted to 1/m^2 or 
1/pc^2. Hopefully it should be possible for someone to write a specialized 
Dimensions object that preserves a compound unit by performing a different 
dimensional analysis. Or perhaps some mechanism could be put in place to 
format the units string representation according to some predefined rules and 
user-defined context. These are probably issues to be addressed at a later 
time.

> How are users going to specify units? Using the existing packages, I
> found it useful to make the units into variables:
> kg = Unit("kg")
> so that I could then do
> wt = 10*kg
> and the error checking would behave nicely.

I am hoping that units can either be set in the constructor or provided after 
the fact using multiplication, similar to your example.
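
A minimal sketch of the two ways of attaching units, assuming a toy Quantity
class that just stores a magnitude and a units string (the real object would
be an ndarray subclass; the class layout here is hypothetical and only meant
to show the calling convention):

```python
class Quantity:
    """Toy stand-in for the proposed ndarray subclass."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units  # a plain string in this sketch

    def __repr__(self):
        return f"{self.magnitude} {self.units}"


class Unit:
    """A named unit that turns a bare number into a Quantity."""

    def __init__(self, name):
        self.name = name

    def __rmul__(self, magnitude):
        # Called for `10 * kg`, because int does not know how to
        # multiply itself by a Unit.
        return Quantity(magnitude, self.name)


kg = Unit("kg")
wt = 10 * kg                        # units supplied by multiplication
also_wt = Quantity(10, units="kg")  # units supplied in the constructor
```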

> > * Quantity has a public units property, providing a view into the
> > object's dimensions and the ability to change from one set of units to
> > another. q.units would return the Dimensions instance, whose __repr__
> > would be dynamically constructed from each dimension's units and power
> > attributes. The setter would have some limitations by design. For example
> > if q has units of kg m / s^2 and you do q.units='ft', then q.units would
> > return kg ft /s^2.
>
> Hmm. This kind of guessing is likely to trip people up. At least the
> default should be "convert to exactly the unit I specified". 

I disagree. It is not physically possible to convert kg m / s^2 to ft. It 
should either convert the units of the appropriate dimension or raise an 
error. Personally, I think the former would be more useful. If one wants the 
latter, perhaps the set_units method could provide a pedantic kwarg that 
would attempt a complete conversion and raise on error.
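
To make the two behaviors concrete, here is a minimal sketch written as a
standalone function rather than a method. The UNITS table, the dict layout
for dimensions, and the exact signature are assumptions for illustration
only. By default only the matching dimension is rescaled; with pedantic=True
the call raises unless the whole quantity can be expressed in the requested
unit.

```python
# Hypothetical table: unit name -> (base dimension, size in a reference unit)
UNITS = {
    "kg": ("mass", 1.0),
    "m": ("length", 1.0),
    "ft": ("length", 0.3048),
    "s": ("time", 1.0),
}


def set_units(value, dims, new_unit, pedantic=False):
    """Re-express `value` with `new_unit` for the matching dimension.

    `dims` maps a base dimension to a (unit, power) pair.
    Returns (new_value, new_dims); the inputs are left untouched.
    """
    dimension, new_scale = UNITS[new_unit]
    if dimension not in dims:
        raise ValueError(f"quantity has no {dimension} dimension")
    old_unit, power = dims[dimension]
    if pedantic and (set(dims) != {dimension} or power != 1):
        raise ValueError(f"cannot convert completely to {new_unit!r}")
    factor = (UNITS[old_unit][1] / new_scale) ** power
    new_dims = dict(dims)
    new_dims[dimension] = (new_unit, power)
    return value * factor, new_dims


# 1 kg m / s^2 expressed as kg ft / s^2
force_dims = {"mass": ("kg", 1), "length": ("m", 1), "time": ("s", -2)}
value, dims = set_units(1.0, force_dims, "ft")
```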

> After 
> all, the point of using units is that they catch many mathematical
> errors, and the sooner they are caught the better. 

I agree. I don't see how the proposed behavior would be problematic; there is 
no guessing involved. If you specify a unit of length, the lengths will be 
expressed in that unit. Maybe you could provide an example showing how 
confusion would arise.

> There is something 
> to be said for a "unit globbing" system ("convert all occurrences of
> metres to feet but leave everything else alone") and for conversion to
> predefined unit systems (MKS, CGS, "Imperial", metric with
> prefixes...) but I don't think it should be the default.
>
> > I think the Dimensions container may provide enough abstraction to handle
> > more unusual operations if someone wanted to add them. Robert Kern
> > suggested a few years back (see
> > http://aspn.activestate.com/ASPN/Mail/Message/scipy-user/2538532) that a
> > good physical quantities system should be able to handle operations like
> > a long,lat position minus another one would yield a distance, but
> > addition would not be supported. This functionality could be built into a
> > subclass of the Dimensions container.
>
> I don't think lat/long, or even Fahrenheit/Celsius are a good idea.
> For one thing, it's a short step from there to general coordinate
> system conversion (what about UTM? ECEF? do you want great circle
> distances or direct line?), and then to conversion of tensor
> quantities between coordinate systems, and down that road lies
> madness.

I have a copy of Levi-Civita's "The Absolute Differential Calculus" sitting 
near the bottom of my stack of books to read and pretend I have understood.

> I think to be a tractable package this needs well-defined 
> boundaries, and the place I'd put those boundaries is at
> multiplicative units. That's enough to be genuinely useful, and it's
> small enough to be doable.

I agree, I was just hoping that someone who had specific additional features 
in mind would speak up and comment on whether the proposed abstractions are 
sufficient and if not, offer their own suggestions.

> > Comments and criticism welcome.
>
> How do you plan to handle radians and degrees? Radians should really
> be no unit at all, but it would sometimes be nice to have them printed
> (and sometimes not).

I guess if you specify angles, you will get angles.

> On a related topic, how are you going to handle ufuncs? Addition and
> subtraction should require commensurable units, multiplication should
> multiply the units, and I think all other standard ufuncs should
> require something with no units. Well, except for mean, std, and var
> maybe. And "pow" is tricky. And, well, you see what I'm getting at.

Well, I already laid out a strategy for dealing with multiplication and 
addition, but I am really not that familiar with ufuncs and there are 
probably some problems lurking that I am not aware of. Maybe I will have to 
rely on object methods to wrap the incompatible ufuncs and return Quantities 
with the appropriate units.

> What about user-defined functions? It's worth having some kind of
> decorator that enforces that a particular function acts on something
> with no units, and maybe enforces particular units. 

Could you give an example? I don't follow.

> How can users 
> conveniently write something like a function to add in quadrature?

If multiplication, addition, and power are supported, shouldn't this be 
transparent?
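
A minimal sketch of what "transparent" could look like, assuming a toy
Quantity class (hypothetical, and without any unit conversion) that supports
only addition, multiplication, and power. The quadrature function itself
needs no knowledge of units; the checks happen inside the operators.

```python
from fractions import Fraction


class Quantity:
    """Toy quantity: a float value plus a dict of unit -> exact power."""

    def __init__(self, value, dims):
        self.value = value
        self.dims = {u: Fraction(p) for u, p in dims.items() if p != 0}

    def __add__(self, other):
        # No conversion in this sketch: units must match exactly.
        if self.dims != other.dims:
            raise ValueError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        dims = dict(self.dims)
        for unit, power in other.dims.items():
            dims[unit] = dims.get(unit, 0) + power
        return Quantity(self.value * other.value, dims)

    def __pow__(self, exponent):
        exponent = Fraction(exponent)
        dims = {u: p * exponent for u, p in self.dims.items()}
        return Quantity(self.value ** float(exponent), dims)


def quadrature(a, b):
    """sqrt(a^2 + b^2), written with no reference to units at all."""
    return (a ** 2 + b ** 2) ** Fraction(1, 2)


total = quadrature(Quantity(3.0, {"m": 1}), Quantity(4.0, {"m": 1}))
```

Adding a length to a time in quadrature raises in __add__, since m^2 and s^2
do not match.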

> Are you going to support fractional exponents in your units? (Note
> that they probably need to be exact fractions to be sure they cancel
> when they're supposed to.) 

Yes, I think this is necessary.
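
The point about exact fractions can be seen with the standard library's
fractions module. This is only a small illustration of the cancellation
issue, not a proposal for how powers would be stored:

```python
from fractions import Fraction

# Float powers: m^0.1 * m^0.2 / m^0.3 should be dimensionless, but the
# summed power is not exactly zero in binary floating point.
float_power = 0.1 + 0.2 - 0.3
print(float_power == 0)   # False: a tiny residual power of metres remains

# Exact rational powers cancel when they are supposed to.
exact_power = Fraction(1, 10) + Fraction(2, 10) - Fraction(3, 10)
print(exact_power == 0)   # True

# Squaring a quantity in m^(1/2) gives exactly m^1.
sqrt_m_squared = Fraction(1, 2) * 2
print(sqrt_m_squared == 1)  # True
```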

> How are you going to deal with CGS (in 
> common use in astronomy for some strange reason) and other
> semi-"natural" units? In these systems some of the formulas actually
> look different because units have been chosen to make constants go
> away. This means that there are fewer basic units (for example in GR
> one sometimes sets G=c=1 and converts everything - kilograms and
> meters - to seconds); how do you handle conversion between one of
> these systems and SI?

I haven't considered it. Like you said, it is better to keep the problem 
tractable.

> As was mentioned in a previous discussion of this issue on this list,
> it's worth looking at how Frink handles this. I don't recommend
> necessarily following Frink's approach, since it has become quite
> complicated, but it's worth understanding the issues that pushed Frink
> to its current size. Keeping this package simple will definitely
> involve building in limitations.

It would be nice to do so, but I don't think Frink's sources are available.

I would like to make clear: my concern is to get the abstractions right so it 
will be flexible enough that others can build on it to provide their desired 
functionality. If anyone has ideas on how the abstractions need to be 
improved, I would like to hear them.

Thanks for the feedback,
Darren

