[Numpy-discussion] draft enum NEP

Nathaniel Smith njs@pobox....
Fri Mar 16 10:07:25 CDT 2012


On Mar 16, 2012 1:02 AM, "Stéfan van der Walt" <stefan@sun.ac.za> wrote:
>
> On Thu, Mar 15, 2012 at 4:02 PM, Nathaniel Smith <njs@pobox.com> wrote:
> > I'm not sure what it would even mean to treat this kind of data as
> > "flags", since you can't take the bitwise-or of two strings...
>
> This makes more sense outside of ndarrays, where you would do something
> like:
>
> enum FLAG0 = 1, FLAG1 = 2, FLAG2 = 4
> do_something(data, mode=FLAG0 | FLAG2)
>
> The enum is therefore just a handle for its numerical value.  While it
> may not be that useful in an array, I think Mark was just pointing out
> that there may be other similar use cases, such as enumerating from 0
> to N-1, or in reverse from N-1 down to 0, or in steps of 2, or in
> powers of 2, etc.

Right, there may be. But are there? That's the question :-)

It looks like R doesn't support anything except 1, ..., N numbering.
There's really no reason it would either, since in their design the
underlying integer values are almost entirely hidden from users. You could
get at them if you wanted, but I bet less than 1% of users are even aware
that factors and integers have anything to do with each other. Factors are
just documented to be a way to store an array of strings drawn from a
limited ordered list. (The ordering is important for things like polynomial
coding and treatment versus baseline coding.)
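To make the factor idea concrete, here is a rough sketch in plain Python/NumPy of storing strings as hidden integer codes against an ordered level list. It is not R's implementation and not the proposed enum dtype; the names `levels`, `labels`, `codes` and `decoded` are made up for illustration, and the +1 only mimics R's 1-based numbering.

```python
import numpy as np

# Hypothetical ordered level list; the order is what coding schemes rely on.
levels = ["low", "mid", "high"]
labels = ["low", "high", "mid", "low"]

# Look up each label's position in the level list, then shift to 1..N
# to imitate R's numbering (an assumption made for this sketch).
level_index = {name: i for i, name in enumerate(levels)}
codes = np.array([level_index[s] for s in labels], dtype=np.int32) + 1

# Users normally see only the strings; the codes stay behind the scenes.
decoded = [levels[c - 1] for c in codes]
```

The user-facing view is `decoded`; `codes` is the storage that most users would never look at.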

HDF5 supports arbitrary symbol<->integer mappings.

0, ..., N-1 coding makes the common problem of creating an indicator matrix
very convenient:
ind = np.zeros((len(enum_a), len(enum_a.dtype.levels)), dtype=bool)
ind[np.arange(len(enum_a)), enum_a.view(dtype=np.int32)] = True
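Since the enum dtype does not exist yet, the same indicator-matrix construction can be tried with a plain integer array standing in for the enum's 0, ..., N-1 codes. In this sketch `codes` and `n_levels` are stand-ins I made up for `enum_a.view(dtype=np.int32)` and `len(enum_a.dtype.levels)`:

```python
import numpy as np

# Stand-in for the enum array viewed as integers: one 0..N-1 code per element.
codes = np.array([0, 2, 1, 2], dtype=np.int32)
n_levels = 3  # stand-in for the number of levels in the enum dtype

ind = np.zeros((len(codes), n_levels), dtype=bool)
# Pair each row index with that row's code so exactly one cell per row is set.
ind[np.arange(len(codes)), codes] = True
```

Each row of `ind` ends up with a single True, in the column given by that element's code.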

But we can't restrict ourselves to only this coding if we want
compatibility with HDF5 or R (because R is 1-based). So I guess supporting
arbitrary mappings is worth it - though I doubt this flexibility will be
used much. I'm curious if anyone can think of a reason they'd use it
besides interoperability.
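For what it's worth, arbitrary mappings need not cost much convenience: stored values can be translated to dense 0, ..., N-1 positions on demand. A minimal sketch, assuming the level values are distinct integers; `dense_codes` is a hypothetical helper, not part of NumPy or of the draft NEP:

```python
import numpy as np

def dense_codes(values, level_values):
    """Map arbitrary integer level values to their 0..N-1 positions
    in level_values (hypothetical helper for illustration)."""
    values = np.asarray(values)
    level_values = np.asarray(level_values)
    order = np.argsort(level_values)
    sorted_levels = level_values[order]
    # Locate each value in the sorted level list.
    pos = np.searchsorted(sorted_levels, values)
    pos = np.clip(pos, 0, len(sorted_levels) - 1)
    if not np.array_equal(sorted_levels[pos], values):
        raise ValueError("value not among the declared levels")
    # Translate sorted positions back to positions in the declared order.
    return order[pos]

# R-style 1-based codes map onto 0-based positions.
r_style = dense_codes([1, 3, 2, 3], level_values=[1, 2, 3])

# An arbitrary, HDF5-style mapping with non-contiguous values.
arbitrary = dense_codes([5, 10, 1, 5], level_values=[10, 1, 5])
```

The result can be used directly as the column index when building an indicator matrix, whatever the stored values happen to be.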

Cheers,
- Nathaniel