[SciPy-User] equivalent of tolist().index(entry) for numpy 1d array of strings
Keith Goodman
kwgoodman@gmail....
Mon Dec 21 20:27:05 CST 2009
On Mon, Dec 21, 2009 at 6:09 PM, Ryan Krauss <ryanlists@gmail.com> wrote:
> I am still open to more elegant solutions, but it seems like my
> concerns about .tolist() being inefficient are unfounded (this may be
> an indicator that I don't understand the inner workings of numpy very
> well).
>
> Here is my test:
>
> t1 = time.time()
> index1 = where(self.md5sum==photo.md5sum)[0][0]
> t2 = time.time()
> index2 = mysearch(self.md5sum, photo.md5sum)
> t3 = time.time()
> index3 = self.md5sum.tolist().index(photo.md5sum)
> t4 = time.time()
If you are using ipython then it is handy, and more accurate, to use
timeit. At the ipython prompt try:
timeit where(self.md5sum==photo.md5sum)[0][0]
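Outside of ipython, something along these lines with the standard timeit
module should give comparable numbers. This is only a rough sketch: the
array contents and the names md5sum, target, use_where and use_tolist are
made up for illustration, standing in for self.md5sum and photo.md5sum.

```python
import timeit

import numpy as np

# Hypothetical stand-in for self.md5sum: about 10 short strings.
md5sum = np.array(['hash%02d' % i for i in range(10)])
target = 'hash07'  # hypothetical stand-in for photo.md5sum


def use_where():
    # Boolean comparison over the whole array, then take the first hit.
    return np.where(md5sum == target)[0][0]


def use_tolist():
    # Convert to a Python list and use list.index.
    return md5sum.tolist().index(target)


# Average over many calls instead of a single time.time() difference.
n = 10000
t_where = timeit.timeit(use_where, number=n) / n
t_tolist = timeit.timeit(use_tolist, number=n) / n
print('where:  %.3g s per call' % t_where)
print('tolist: %.3g s per call' % t_tolist)
```

Averaging over many calls smooths out the noise you get from timing a
single call that only takes tens of microseconds.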
>
> All 3 approaches lead to the same result. Here are my timing results:
> t2-t1=4.81605529785e-05
> t3-t2=4.98294830322e-05
> t4-t3=2.00271606445e-05
>
> def mysearch(arrayin, element):
>     bool_vect = where(arrayin==element)[0]
>     assert(len(bool_vect)==1), 'Did not find exactly 1 match for ' + str(element)
>     return bool_vect[0]
If element is not in arrayin then mysearch will fail (the assert raises
AssertionError). Same for .index, which raises ValueError.
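If a missing element is something you expect to happen, one option is to
check for it explicitly. A minimal sketch, assuming you want None back when
there is no match and the first index when there are several (find_first
and the names array are hypothetical, not something from your code):

```python
import numpy as np


def find_first(arr, element):
    """Return the index of the first match, or None if there is none."""
    # Indices of all positions where the comparison is True.
    matches = np.flatnonzero(arr == element)
    if matches.size == 0:
        return None
    return int(matches[0])


names = np.array(['a', 'b', 'c', 'b'])
print(find_first(names, 'b'))
print(find_first(names, 'z'))
```

Unlike mysearch, this does not complain about duplicates, so keep the
assert if exactly one match is a requirement.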
>
> Now, for this test, the arrays didn't have very many elements (10 ish).
>
> FWIW,
>
> Ryan
>
> On Mon, Dec 21, 2009 at 7:53 PM, Ryan Krauss <ryanlists@gmail.com> wrote:
>> I wrote some code to work with csv spreadsheet files by reading the
>> columns into lists, but I need to rework the code to work with numpy
>> 1d arrays of strings rather than lists. I need to search one of these
>> columns/arrays. What is the best way to find the index for the
>> element that matches a certain string (or maybe just the first element
>> to match such a string)?
>>
>> With the columns as lists, I was doing
>> index = mylist.index(entry)
>>
>> So, I could obviously do
>> index = mylist.tolist().index(entry)
>>
>> but I don't know if that would be slower or clumsier than something like
>> bool_vect = where(mylist==entry)[0]
>> index = bool_vect[0]
>>
>> or just
>>
>> index = where(mylist==entry)[0][0]
>>
>> Any thoughts? Is there an easier way?
>>
>> Thanks,
>>
>> Ryan
>>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>