[Numpy-discussion] genfromtxt view with object dtype

Brent Pedersen bpederse@gmail....
Wed Feb 4 15:03:16 CST 2009


On Wed, Feb 4, 2009 at 9:36 AM, Pierre GM <pgmdevlist@gmail.com> wrote:
>
> On Feb 4, 2009, at 12:09 PM, Brent Pedersen wrote:
>
>> hi, i am using genfromtxt, with a dtype like this:
>> [('seqid', '|S24'), ('source', '|S16'), ('type', '|S16'), ('start',
>> '<i4'), ('end', '<i4'), ('score', '<f8'), ('strand', '|S1'), ('phase',
>> '<i4'), ('attrs', '|O4')]
>
> Brent,
> Please post a simple, self-contained example with a few lines of the
> file you want to load.

hi pierre, here is an example.
thanks,
-brent

######################

import numpy as np
from cStringIO import StringIO

gffstr = """\
##gff-version 3
1\tucb\tgene\t2234602\t2234702\t.\t-\t.\tID=grape_1_2234602_2234702;match=EVM_prediction_supercontig_1.248,EVM_prediction_supercontig_1.248.mRNA
1\tucb\tgene\t2300292\t2302123\t.\t+\t.\tID=grape_1_2300292_2302123;match=EVM_prediction_supercontig_244.8
1\tucb\tgene\t2303615\t2303967\t.\t+\t.\tID=grape_1_2303615_2303967;match=EVM_prediction_supercontig_244.8
1\tucb\tgene\t2303616\t2303966\t.\t+\t.\tParent=grape_1_2303615_2303967
1\tucb\tgene\t3596400\t3596503\t.\t-\t.\tID=grape_1_3596400_3596503;match=evm.TU.supercontig_167.27
1\tucb\tgene\t3600651\t3600977\t.\t-\t.\tmatch=evm.model.supercontig_1217.1,evm.model.supercontig_1217.1.mRNA
"""

dtype = {'names': ('seqid', 'source', 'type', 'start', 'end',
                   'score', 'strand', 'phase', 'attrs'),
         'formats': ['S24', 'S16', 'S16', 'i4', 'i4', 'f8',
                     'S1', 'i4', 'S128']}

# OK: works when the attrs column is a fixed-width string (S128)
print np.genfromtxt(StringIO(gffstr), dtype=dtype)



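def _attr(kvstr):
    # parse "key=value;key=value" into a dict; split on the first "=" only
    # so that a value containing "=" stays intact
    pairs = [kv.split("=", 1) for kv in kvstr.split(";")]
    return dict(pairs)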

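# change the attrs format from S128 to object so that each value
# in the attrs column is stored as a dictionary
dtype['formats'][-1] = 'O'
converters = {8: _attr}
# NOT OK: object dtype for attrs combined with the converter
print np.genfromtxt(StringIO(gffstr), dtype=dtype, converters=converters)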
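
One possible workaround is sketched below. It is written for Python 3 and a recent NumPy, not the Python 2 setup used above, and encoding="utf-8" assumes NumPy 1.14 or later. The idea is to read attrs as a fixed-width string first, then copy the result into a second structured array whose attrs field has object dtype. The names parse_attrs, raw and out are illustrative only. This sidesteps the converter-plus-object-dtype path rather than fixing it.

```python
import io
import numpy as np

# Two rows taken from the example above (tab-separated, nine columns).
gff = """\
##gff-version 3
1\tucb\tgene\t2234602\t2234702\t.\t-\t.\tID=grape_1_2234602_2234702;match=EVM_prediction_supercontig_1.248,EVM_prediction_supercontig_1.248.mRNA
1\tucb\tgene\t2303616\t2303966\t.\t+\t.\tParent=grape_1_2303615_2303967
"""

# Same layout as the original dtype, with unicode strings for Python 3.
str_dtype = np.dtype([
    ('seqid', 'U24'), ('source', 'U16'), ('type', 'U16'),
    ('start', 'i4'), ('end', 'i4'), ('score', 'f8'),
    ('strand', 'U1'), ('phase', 'i4'), ('attrs', 'U128'),
])


def parse_attrs(kvstr):
    """Turn 'k1=v1;k2=v2' into {'k1': 'v1', 'k2': 'v2'}."""
    return dict(kv.split("=", 1) for kv in kvstr.split(";"))


# Pass 1: load everything with plain string and numeric fields.
# Numeric fields that cannot be parsed (the '.' placeholders) fall back
# to genfromtxt's fill values, since loose=True is the default.
raw = np.genfromtxt(io.StringIO(gff), dtype=str_dtype,
                    delimiter="\t", encoding="utf-8")

# Pass 2: build a dtype identical to the first except that attrs is object.
obj_dtype = np.dtype([
    (name, 'O' if name == 'attrs' else raw.dtype[name])
    for name in raw.dtype.names
])
out = np.empty(raw.shape, dtype=obj_dtype)

for name in raw.dtype.names:
    if name == 'attrs':
        # Assign one dict per row into the object field.
        for i, text in enumerate(raw['attrs']):
            out['attrs'][i] = parse_attrs(text)
    else:
        out[name] = raw[name]
```

After this, out['attrs'][0]['ID'] gives the ID of the first feature, and the other columns behave as in the S128 case. The cost is a second copy of the data in memory while both arrays exist.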

