[AstroPy] extracting column names from VOtable

Michael Droettboom mdroe@stsci....
Mon Feb 25 11:57:32 CST 2013


I admit, this is horribly confusing because the term "name" is used in 
VOTable and Numpy to represent different concepts.

In VOTable, ID is guaranteed to be unique, but is not required. Names 
are not guaranteed to be unique, but are required.
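
To make the ID/name distinction concrete, here is a minimal sketch that reads both attributes straight from the FIELD elements using only the Python standard library. The inline document and its field values are made up for illustration (they are not the AMIGA table from this thread), and the namespace assumes a VOTable 1.2 file.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-column VOTable: the first FIELD has both ID and name,
# the second has only the (required) name.
VOTABLE = """<VOTABLE version="1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
  <RESOURCE>
    <TABLE>
      <FIELD ID="col1" name="CIG Number" datatype="int"/>
      <FIELD name="RA" datatype="double" unit="deg"/>
      <DATA><TABLEDATA><TR><TD>1</TD><TD>0.5</TD></TR></TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

NS = "{http://www.ivoa.net/xml/VOTable/v1.2}"  # assumed VOTable 1.2 namespace

root = ET.fromstring(VOTABLE)

# Collect (ID, name) for every FIELD; a missing ID comes back as None.
pairs = [(el.get("ID"), el.get("name")) for el in root.iter(NS + "FIELD")]

for field_id, name in pairs:
    print("ID=%r name=%r" % (field_id, name))
```

The second field shows why ID alone cannot serve as a column label: it is simply absent there, while name is always present.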

In numpy, names are required and must be unique. Titles are not 
required, and are not required to be unique.

So a name is not a name. The conceptual mapping is really that VOTable 
name == numpy title and VOTable ID == numpy name.
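
As a rough illustration of that mapping on the numpy side, the sketch below builds a structured dtype by hand with ((title, name), format) entries and then recovers the titles. The field values and the helper `title_or_name` are hypothetical, and the dtype is constructed directly rather than read from a real VOTable.

```python
import numpy as np

# Hand-built dtype: the first two fields carry a title, the third does not.
dt = np.dtype([
    (("CIG Number", "col1"), "O"),
    (("RA", "col2"), "f8"),
    ("col3", "i4"),
])

# dtype.names only ever lists the names (the VOTable IDs in this mapping).
ids = list(dt.names)


def title_or_name(dtype, name):
    """Return the field's title if it has one, otherwise its name."""
    info = dtype.fields[name]  # (dtype, offset) or (dtype, offset, title)
    return info[2] if len(info) > 2 else name


# Human-readable labels (the VOTable names in this mapping).
labels = [title_or_name(dt, n) for n in dt.names]

print(ids)
print(labels)
```

Unlike indexing into `descr`, the helper falls back to the name when a field has no title, so it does not depend on every field having both.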

Admittedly, this deserves better documentation in astropy, but I think 
this is actually the correct mapping of concepts.

I have proposed to the VOTable committee to make IDs required, which 
would make this much more straightforward, but I have not heard a 
decision about this, and it appears that it will not make it into 
VOTable 1.3 -- we might have to wait for VOTable 1.4 for that, if it 
happens at all.

Mike

On 02/25/2013 12:31 PM, Erik Tollerud wrote:
> See https://github.com/astropy/astropy/issues/819 for the issue I just
> created to address this topic.
>
> On Mon, Feb 25, 2013 at 12:23 PM, Erik Tollerud <erik.tollerud@gmail.com> wrote:
>> Aha, I understand what's happening now.  The fields that have both
>> "ID" and "name" get *both* attached to the dtype, but `dtype.names`
>> only gives you the ID.  For example, if you do ``dt.descr[0]``, you
>> get (('CIG Number', 'col1'), '|O8'), but ``dt.names[0]`` just gives
>> 'col1'.  So the way to get what you want out of the dtype is probably
>> like this:
>> [des[0][0] for des in dtype_a.descr]
>> Although that's certainly rather awkward.
>>
>> I *think* this is easily fixable, and I agree with you that that's
>> surprising behavior.  This might be considered either a bug or a
>> feature request, but either way I'll put it in the issue tracker so
>> it can hopefully get into the next version.
>>
>> Thanks!
>>
>> On Mon, Feb 25, 2013 at 11:45 AM, Susana Sanchez <susanasanche@gmail.com> wrote:
>>> 2013/2/25 Erik Tollerud <erik.tollerud@gmail.com>:
>>>> Ah, I see that - I didn't read to the bottom of your original post!
>>>>
>>>> Odd, I would have thought that would work.  But as long as the [f.name
>>>> for f in table.fields] method works, I guess it's fine.
>>>>
>>>> Is the VOTable that's doing this somewhere publicly available?  It
>>>> might be useful if someone wants to check if this is a bug or
>>>> something...
>>>
>>> Yes, it is publicly available in the VO service "AMIGA catalogue",
>>> http://amiga.iaa.es/amigasearch. Attached you can find the votable
>>> that I was using. I got it from this VO service through TOPCAT.
>>>
>>>
>>>> On Mon, Feb 25, 2013 at 11:18 AM, Susana Sanchez <susanasanche@gmail.com> wrote:
>>>>> Thanks Erik,
>>>>>
>>>>> The first way you mention ([f.name for f in table.fields]) is just what
>>>>> I am looking for, but the alternative way, using the dtype of the array
>>>>> (table.array.dtype.names), does not give the same thing. In those
>>>>> cases where the votable fields contain both 'ID' and 'name',
>>>>> table.array.dtype.names gives the values in 'ID' but not in 'name'.
>>>>>
>>>>>
>>>>>
>>>>> 2013/2/25 Erik Tollerud <erik.tollerud@gmail.com>:
>>>>>> This is probably the easiest way:
>>>>>>
>>>>>> [f.name for f in table.fields]
>>>>>> ``table.fields`` is a list of `astropy.io.votable.tree.Field` objects, and
>>>>>> those objects have all the information about the columns (including
>>>>>> things like units).
>>>>>>
>>>>>> An alternative is to use `array` to get the numpy array from the
>>>>>> votable, and the `dtype` has the column names.  I.e.,
>>>>>> ``table.array.dtype.names`` should give the same thing.  That won't
>>>>>> include extra VO information like units and such, though.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 25, 2013 at 6:26 AM, Susana Sanchez <susanasanche@gmail.com> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Probably this is a newbie question, but how can I extract the names of
>>>>>>> the columns from the VOtable with the Astropy library?
>>>>>>>
>>>>>>> I want to show the VOTable data in a nice way, using the Qt library,
>>>>>>> so I need to extract the names of the columns and the data from a
>>>>>>> VOTable. I have tried it using the Numpy record array associated with
>>>>>>> the votable, see code below.  But I have problems when the votable
>>>>>>> fields have 'ID' and also 'name'. I am wondering if there is a better
>>>>>>> way to find the column names.
>>>>>>>
>>>>>>> I would be very grateful if anyone can help me or give me any hint.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Susana.
>>>>>>>
>>>>>>>
>>>>>>>
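>>>>>>> from astropy.io.votable import parse_single_table
>>>>>>>
>>>>>>> table = parse_single_table("/home/susana/Documents/examples/tables/cig22.xml", pedantic=False)
>>>>>>> data = table.array
>>>>>>> dtype_a = data.dtype
>>>>>>> # dtype.fields is keyed by both names (VOTable IDs) and titles (VOTable names)
>>>>>>> column_names = []
>>>>>>> for k, v in dtype_a.fields.items():
>>>>>>>     column_names.append(k)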
>>>>>>> _______________________________________________
>>>>>>> AstroPy mailing list
>>>>>>> AstroPy@scipy.org
>>>>>>> http://mail.scipy.org/mailman/listinfo/astropy
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Erik
>>>>
>>>>
>>>> --
>>>> Erik
>>
>>
>> --
>> Erik
>
>
> --
> Erik
