[Numpy-discussion] combinations anomaly

Charles R Harris charlesr.harris@gmail....
Sat Sep 22 20:59:02 CDT 2007


On 9/22/07, Charles R Harris <charlesr.harris@gmail.com> wrote:
>
>
>
> On 9/22/07, Alan G Isaac <aisaac@american.edu> wrote:
> >
> > Charles harris posted a generator function for generating
> > combinations (based on Knuth's fascicle).  I get the
> > expected results by iterating over the resulting generator,
> > but not if I let ``list`` do that for me.  What is more,
> > changing ``arange`` to ``range`` in the code eliminates
> > this anomaly.
> >
> > What am I failing to understand here?
> >
> > Thank you,
> > Alan Isaac
>
>
> There are a couple of potential problems if you want to make a list.
> Because an array view is returned, and the data is updated in the loop, all
> the views will end up with the same content. I used arrays and views for
> speed. To fix that, you need to return a copy, i.e., yield c[:t].copy().
> That way you will end up with a list of arrays. If you do yield list(c[:t]),
> you will get a list of lists. Or, you can do as you did and just use range
> instead of arange because a slice of a list returns a copy. I admit I'm
> curious about the speed of the two approaches, lists may be faster than
> arrays. Lets see.... combination returns array copies, combinaion1 uses
> range.
>
> In [7]: time for i in combination(100,3) : pass
> CPU times: user 0.89 s, sys: 0.07 s, total: 0.96 s
> Wall time: 0.96
>
> In [8]: time for i in combination1(100,3) : pass
> CPU times: user 0.17 s, sys: 0.01 s, total: 0.18 s
> Wall time: 0.18
>
> Wow, massive speed up using lists, almost as fast as nested loops. I
> expect lists benefit from faster indexing and faster creation. I think your
> range fix is the way to go. Things slow down a bit for the full list
> treatment, but not too much:
>
> In [13]: time a = list(combination1(100,3))
> CPU times: user 0.26 s, sys: 0.00 s, total: 0.27 s
> Wall time: 0.27
>
> In [14]: time a = [i for i in combination1(100,3)]
> CPU times: user 0.35 s, sys: 0.01 s, total: 0.36 s
> Wall time: 0.36
>

It's even faster in C++

In [1]: from _combination import combination

In [2]: time a = combination(100,3)
CPU times: user 0.09 s, sys: 0.01 s, total: 0.09 s
Wall time: 0.09

That's for the whole nested list. Arrays would probably be faster in this
case because I am calling the python functions to add objects to the lists.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://projects.scipy.org/pipermail/numpy-discussion/attachments/20070922/75a4024c/attachment.html 


More information about the Numpy-discussion mailing list