[SciPy-user] How to start with SciPy and NumPy

Sturla Molden sturla@molden...
Sun Jan 25 09:40:53 CST 2009


> On Sun, Jan 25, 2009 at 8:17 PM, Vicent <vginer@gmail.com> wrote:

>> (2) What about lists with different typed items within them?
>
> Numpy arrays - and generally arrays - fundamentally rely on the
> assumption of the same type for every item. A lot of the performances
> of array comes from this assumption (it means you can access any item
> randomly without the need to traverse any other item first, etc...).

Just to clear up a common misunderstanding:

Pythons lists are also implemented as arrays, with the append to the back
being amortized to O(1). This means that Python allocates some empty space
at the end, proportional to the size of the list. Thus, every append does
not need to invoke realloc, and the complexity becomes O(1) on average.
Python's dict and set are also amortized to O(1). And the
collections.deque has amortized to O(1) when appending to both ends.

A python list and deque looks like an array of pointers in C, and thus
supports random access.

The performance of arrays over linked lists comes mainly from cache
coherency, not random access. Most use of these containers are sequential.

It is often claimed that Python lists are slower than linked lists for
appends in the middle. This is not true: While the "insert in the middle"
is O(1) with a linked list and O(N) with Python lists, reaching the middle
item is O(1) with Python lists and O(N) with linked lists. For an append
in the middle you need to to both.

Using linked lists is a bad habit of many programmers, particularly from
the Java community, because introductory CS textbooks explain linked lists
without explaining their weaknesses. I'd estimate that >75% of all
programmers in this world does not know that list and tree structures can
be implemented more efficiently using arrays instead of chained pointers.


Sturla Molden








More information about the SciPy-user mailing list