[Numpy-discussion] recarray slow?
wheres pythonmonks
wherespythonmonks@gmail....
Wed Jul 21 15:57:31 CDT 2010
My code had a bug:
idx_by_name = dict((n,i) for i,n in enumerate(d.dtype.names))
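
For anyone following along, here is a minimal sketch of how that mapping can be used. The array `d` below is a made-up stand-in for my data (the field names and values are assumptions for illustration only); the `idx_by_name` line is the one above.

```python
import numpy as np

# Hypothetical structured array standing in for `d`; the fields and
# values are invented for this example.
d = np.array(
    [("2010-07-20", 1.5, 10), ("2010-07-21", 2.5, 20)],
    dtype=[("Date", "U10"), ("Price", "f8"), ("Volume", "i4")],
)

# Map each field name to its position in the dtype.
idx_by_name = dict((n, i) for i, n in enumerate(d.dtype.names))

# A single record can be indexed by position or by field name.
first_row = d[0]
date_by_position = first_row[idx_by_name["Date"]]
date_by_name = first_row["Date"]
```

Both lookups return the same value, so the dictionary only helps if positional access is actually needed somewhere.
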
On Wed, Jul 21, 2010 at 4:49 PM, Pauli Virtanen <pav@iki.fi> wrote:
> Wed, 21 Jul 2010 16:22:37 -0400, wheres pythonmonks wrote:
>> However: is there an automatic way to convert a named index to a
>> position?
>
> It's not really a named index -- it's a field name. Since the fields of
> an array element can be of different size, they cannot be referred to
> with an array index (in the sense that Numpy understands the concept).
>
>> What about looping over tuples of my recarray:
>>
>> for t in d:
>>     date = t['Date']
>>     ....
>>
>> I guess that the above does have to lookup 'Date' each time.
>
> As Pierre said, you can move the lookups outside the loop.
>
>     for date in d['Date']:
>         ...
>
> If you want to iterate over multiple fields, it may be best to use
> itertools.izip so that you unbox a single element at a time.
>
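
A rough sketch of what I understand this suggestion to mean, assuming a small made-up array `d` (the fields and values are hypothetical). It uses the built-in `zip`, which on Python 3 is lazy in the same way `itertools.izip` is on Python 2.

```python
import numpy as np

# Hypothetical structured array; the fields and values are invented.
d = np.array(
    [("2010-07-20", 1.5), ("2010-07-21", 2.5)],
    dtype=[("Date", "U10"), ("Price", "f8")],
)

# Look each field up once, outside the loop.
dates = d["Date"]
prices = d["Price"]

# Iterate over the columns in lockstep, one element per field at a time.
rows = []
for date, price in zip(dates, prices):
    rows.append((str(date), float(price)))
```

The field-name lookups happen twice in total here, rather than once per row.
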
> However, I'd be quite surprised if the hash lookups would actually take a
> significant part of the run time:
>
> 1) Python dictionaries are ubiquitous and the implementation appears
> heavily optimized to be fast with strings.
>
> 2) The hash of a Python string is cached, and computed only once.
>
> 3) String literals are interned, and represented by a single object only:
>
> >>> 'Date' is 'Date'
> True
>
> So when running the above Python code, the hash of 'Date' is computed
> exactly once.
>
> 4) For small dictionaries containing strings, such as the fields
> dictionary, I'd expect 1-3) to be dwarfed by the overhead involved
> in making Python function calls (PyArg_*) and interpreting the
> bytecode.
>
> So the usual optimization mantra applies here: measure first :)
>
> Of course, if you measure and show that the expectations 1-4) are
> actually wrong, that's fine.
>
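
One way to do that measurement, as a sketch only: the array contents, its size, and the function names `per_row_lookup` and `column_first` are all made up for illustration, and the timings will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

# Hypothetical data: n records with an integer "Date" field.
n = 10000
d = np.zeros(n, dtype=[("Date", "i8"), ("Price", "f8")])
d["Date"] = np.arange(n)


def per_row_lookup():
    """Look the field up by name on every record."""
    total = 0
    for t in d:
        total += t["Date"]
    return total


def column_first():
    """Look the field up once, then iterate over the column."""
    total = 0
    for date in d["Date"]:
        total += date
    return total


# Both variants must agree before their timings are worth comparing.
assert per_row_lookup() == column_first()

t_row = timeit.timeit(per_row_lookup, number=5)
t_col = timeit.timeit(column_first, number=5)
print("per-row lookup: %.4f s, column first: %.4f s" % (t_row, t_col))
```

Note that the per-row variant also pays for creating a record object on each iteration, so a difference between the two numbers is not purely the cost of the name lookup.
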
> --
> Pauli Virtanen
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>