[Numpy-discussion] genfromtxt view with object dtype

Brent Pedersen bpederse@gmail....
Wed Feb 4 23:22:31 CST 2009


On Wed, Feb 4, 2009 at 8:51 PM, Pierre GM <pgmdevlist@gmail.com> wrote:
> OK, Brent, try r6341.
> I fixed genfromtxt for cases like yours (explicit dtype involving a
> np.object).
> Note that the fix won't work if the dtype is nested and involves
> np.objects (as we would hit the problem of renaming fields we observed...).
> Let me know how it goes.
> P.
>

that fixes it. thanks again pierre!
-b
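
For readers on Python 3 with a recent NumPy, the case that r6341 fixed (a flat, non-nested dtype whose last field is an object filled by a converter) might look roughly like the sketch below. It is not the code from the thread: the name parse_attrs, the 'U' string formats in place of 'S', the explicit delimiter, and the encoding argument are all assumptions made here for illustration, and only two of the records are used.

```python
import io

import numpy as np


def parse_attrs(kvstr):
    """Turn 'ID=x;match=y' into {'ID': 'x', 'match': 'y'} (hypothetical helper)."""
    return dict(kv.split("=", 1) for kv in kvstr.split(";"))


# Two records taken from the example in the thread; real tabs separate
# the nine GFF columns, and the '##' header line is skipped as a comment.
gff_text = (
    "##gff-version 3\n"
    "1\tucb\tgene\t2300292\t2302123\t.\t+\t."
    "\tID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8\n"
    "1\tucb\tgene\t2303616\t2303966\t.\t+\t."
    "\tParent=grape_1_2303615_2303967\n"
)

# Flat (non-nested) structured dtype; the last field holds Python objects.
# Assumption: 'U' (str) formats are used here instead of the thread's 'S'.
gff_dtype = [
    ("seqid", "U24"), ("source", "U16"), ("type", "U16"),
    ("start", "i4"), ("end", "i4"), ("score", "f8"),
    ("strand", "U1"), ("phase", "i4"), ("attrs", "O"),
]

records = np.genfromtxt(
    io.StringIO(gff_text),
    dtype=gff_dtype,
    delimiter="\t",
    converters={8: parse_attrs},  # column 8 is the attributes column
    encoding="utf-8",             # so the converter receives str, not bytes
)

# Each entry of the 'attrs' field is an ordinary dict.
print(records["attrs"][0]["ID"])
```

A nested dtype containing an object field is the case Pierre says the fix does not cover, so the sketch keeps the dtype flat.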




> On Feb 4, 2009, at 4:03 PM, Brent Pedersen wrote:
>
>> On Wed, Feb 4, 2009 at 9:36 AM, Pierre GM <pgmdevlist@gmail.com> wrote:
>>>
>>> On Feb 4, 2009, at 12:09 PM, Brent Pedersen wrote:
>>>
>>>> hi, i am using genfromtxt, with a dtype like this:
>>>> [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'),
>>>> ('start', '<i4'), ('end', '<i4'), ('score', '<f8'),
>>>> ('strand', '|S1'), ('phase', '<i4'), ('attrs', '|O4')]
>>>
>>> Brent,
>>> Please post a simple, self-contained example with a few lines of the
>>> file you want to load.
>>>
>>> _______________________________________________
>>> Numpy-discussion mailing list
>>> Numpy-discussion@scipy.org
>>> http://projects.scipy.org/mailman/listinfo/numpy-discussion
>>>
>>
>> hi pierre, here is an example.
>> thanks,
>> -brent
>>
>> ######################
>>
>> import numpy as np
>> from cStringIO import StringIO
>>
>> gffstr = """\
>> ##gff-version 3
>> 1\tucb\tgene\t2234602\t2234702\t.\t-\t.\tID=grape_1_2234602_2234702;match=EVM_prediction_supercontig_1.248,EVM_prediction_supercontig_1.248.mRNA
>> 1\tucb\tgene\t2300292\t2302123\t.\t+\t.\tID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8
>> 1\tucb\tgene\t2303615\t2303967\t.\t+\t.\tID=grape_1_2303615_2303967;match=EVM_prediction_supercontig_244.8
>> 1\tucb\tgene\t2303616\t2303966\t.\t+\t.\tParent=grape_1_2303615_2303967
>> 1\tucb\tgene\t3596400\t3596503\t.\t-\t.\tID=grape_1_3596400_3596503;match=evm.TU.supercontig_167.27
>> 1\tucb\tgene\t3600651\t3600977\t.\t-\t.\tmatch=evm.model.supercontig_1217.1,evm.model.supercontig_1217.1.mRNA
>> """
>>
>> dtype = {'names' :
>>                  ('seqid', 'source', 'type', 'start', 'end',
>>                    'score', 'strand', 'phase', 'attrs') ,
>>        'formats':
>>                  ['S24', 'S16', 'S16', 'i4', 'i4', 'f8',
>>                      'S1', 'i4', 'S128']}
>>
>> #OK with S128 for attrs
>> print np.genfromtxt(StringIO(gffstr), dtype = dtype)
>>
>>
>>
>> def _attr(kvstr):
>>    pairs = [kv.split("=") for kv in kvstr.split(";")]
>>    return dict(pairs)
>>
>> # change S128 to object to have col attrs as dictionary
>> dtype['formats'][-1] = 'O'
>> converters = {8: _attr }
>> #NOT OK
>> print np.genfromtxt(StringIO(gffstr), dtype = dtype,
>> converters=converters)

