[SciPy-dev] design of a physical quantities package: seeking comments

Anne Archibald peridot.faceted@gmail....
Sun Aug 3 02:46:21 CDT 2008

2008/8/2 Darren Dale <dsdale24@gmail.com>:

> On Saturday 02 August 2008 4:23:43 pm Anne Archibald wrote:
>> 2008/8/2 Darren Dale <dsdale24@gmail.com>:
>> I think it's a good idea to try to keep things in the units they are
>> provided in; this should reduce the occurrences of unexpected
>> overflows when (for example) cubing a distance in megaparsecs (put
>> this in centimetres and use single precision and you might overflow).
>> But this does mean you need to decide, when adding (say) a distance in
>> feet to a distance in metres, which unit the result should be in.
> Yes, a decision will have to be made as to whether A*B+C will yield a result
> in units of A or C. I am not worried at this point about overflows or loss of
> precision when attempting to convert a quantity that is some integer dtype.

I don't think it's worth worrying about integers. But if you take,
say, 10 megaparsecs, and express it in metres, you get 3e23 m. For the
volume of a cube 10 Mpc on a side, you get about 3e70 - a number too
big to fit in a single-precision float. So the cosmologists will be
forced to use doubles if you internally represent everything using SI
units. For the same reason, you may want to make the conversion rules
very clear. ("Always convert to the left-hand unit" would work.)

>> What about units like pc/cm^3
>> (actually in very common use in radio astronomy)? How do you preserve
>> pc/m^3 without getting abominations like kg m^2 s^2/kg m s^2?
> I'm not familiar with the issue here (how do you go from length/length^3 to
> length, and where did the mass and time units come from?). To begin with, I
> plan on attacking this problem the same way one would do dimensional
> analysis, converting everything to the basic dimensions of mass, length,
> time, charge (or current) and temperature, which I think is the way
> enthought.units handles it as well. pc/m^3 would be converted to 1/m^2 or
> 1/pc^2. Hopefully it should be possible for someone to write a specialized
> Dimensions object that preserves a compound unit by performing a different
> dimensional analysis. Or perhaps some mechanism could be put in place to
> format the units string representation according some predefined rules and
> user-defined context. These are probably issues to be addressed at a later
> time.

Er. That was two separate examples. The important point is that I
*don't* want 50 pc/cm^3 to be converted to 1.5e24 m^{-2}. The former
is the unit universally used in the literature. If I have to I can
circumvent the unit printing system by doing
print "%g pc/cm^3" % (DM/Unit("pc/cm^3"))
but this is a pain. (Overflow may also be an issue here.)

>> How are users going to specify units? Using the existing packages, I
>> found it useful to make the units into variables:
>> kg = Unit("kg")
>> so that I could then do
>> wt = 10*kg
>> and the error checking would behave nicely.
> I am hoping that units can either be set in the constructor or provided after
> the fact using multiplication, similar to your example.

We use many constructors for arrays, so you may find it easier to
provide only a constructor for an array scalar with given units, then
make available variables in all the usual units.

>> > * Quantity has a public units property, providing a view into the
>> > object's dimensions and the ability to change from one set of units to
>> > another. q.units would return the Dimensions instance, whose __repr__
>> > would be dynamically constructed from each dimension's units and power
>> > attributes. The setter would have some limitations by design. For example
>> > if q has units of kg m / s^2 and you do q.units='ft', then q.units would
>> > return kg ft /s^2.
>> Hmm. This kind of guessing is likely to trip people up. At least the
>> default should be "convert to exactly the unit I specified".
> I disagree. It is not physically possible to convert kg m / s^2 to ft. It
> should either convert the units of the appropriate dimension or raise an
> error. Personally, I think the former would be more useful. If one wants the
> former, perhaps the set_units method could provide a pedantic kwarg that
> would attempt a complete conversion and raise on error.
>> After
>> all, the point of using units is that they catch many mathematical
>> errors, and the sooner they are caught the better.
> I agree. I don't see how the proposed behavior would be problematic, there is
> no guessing involved. If you specify a unit of length, the lengths will be
> expressed in that unit. Maybe you could provide an example showing how
> confusion would arise.

Here's a simple example: I have a quantity A I think is in metres. I
write "A=convert(A,'ft')". My program continues, combining A with
various other quantities. At the end I obtain a meaningless number
with bizarre units and I have to trace back through my code to find
out that these weird units were attached to A, which in fact was in
units of energy density.

More, what happens when you have a quantity in Newton-metres (say) and
you ask it to convert to feet? Do you get Newton-feet, or do all the
occurrences of "feet" in the Newtons get converted?

What about conversions like ergs -> joules? Each of these is a
composite unit, so the code has to have some kind of procedure to find
the right number of powers of ergs to convert, leaving behind
whatever's left. Combine this with fractional exponents and you have a
real nightmare.

>> On a related topic, how are you going to handle ufuncs? Addition and
>> subtraction should require commensurable units, multiplication should
>> multiply the units, and I think all other standard ufuncs should
>> require something with no units. Well, except for mean, std, and var
>> maybe. And "pow" is tricky. And, well, you see what I'm getting at.
> Well, I already laid out a strategy for dealing with multiplication and
> addition, but I am really not that familiar with ufuncs and there are
> probably some problems lurking that I am not aware of. Maybe I will have to
> rely on object methods to wrap the incompatible ufuncs and return Quantities
> with the appropriate units.

I don't really have any idea how this is to be accomplished, but (for
example) the function "exp" really needs to check that its argument
has no units. It would also be nice to be able to use numpy.exp rather
than scipy.units.exp.

It might work to make unitless unit arrays be normal arrays, so that
the stock ufuncs worked on them. This does force you to reduce units
early, though.

>> What about user-defined functions? It's worth having some kind of
>> decorator that enforces that a particular function acts on something
>> with no units, and maybe enforces particular units.
> Could you give an example? I don't follow.

Well, suppose I have a function that, like "exp", only accepts
unitless quantities. Suppose in fact it's already written. It would be
nice to have a decorator so all I have to write is

def myfunc(x):

Similarly, if I want to implement the Planck black-body function, it
would be nice to be able to write

def B(T,nu):

> I would like to make clear: my concern is to get the abstractions right so it
> will be flexible enough that others can build on it to provide their desired
> functionality. If anyone has ideas on how the abstractions need to be
> improved, I would like to here them.

I think the hard part will be getting the ufuncs to behave correctly.
As you say, if you can get addition, multiplication, and fractional
powers working, you're pretty much there.

The key question for me is what units quantities are kept in
internally. Keep in mind that some arrays are large, so conversions of
quantities between units can be expensive. If I'm doing many
calculations with quantities expressed in pc/cm^3, I would hope that
they are not constantly being converted to m^{-2} on input and back to
pc/cm^3 on output. In fact I can imagine cases where both these
conversions happen inside a loop. That could get expensive.


More information about the Scipy-dev mailing list