[Numpy-discussion] List/location of consecutive integers (2)
Wed May 27 08:24:06 CDT 2009
Christopher Barker wrote:
> Andrea Gavana wrote:
>> I have tried the solutions proposed in the previous thread and it
>> looks like Chris' one is the fastest for my purposes.
> whoo hoo! What do I win? ;-)
>> Splitting the reading process between 4 processes will require the
>> exchange of 5-20 MB from the child processes to the main one: do you
>> think my script will benefit from using multiprocessing?
> If you are talking about multiprocessing to read the data in -- I don't
> think so -- that's probably IO bound anyway. You can't make your disks
> faster with multiple processors.
>> Should I try another approach?
> I don't know it will do anything for performance, but you might want to
> look at memory mapped arrays -- it's a very cool way to work with data
> files too big to want to bring into memory all at once.
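
For reference, a minimal sketch of the memory-mapped approach Chris mentions, using numpy.memmap. The file name, dtype and size here are made up for illustration (the thread does not say what the real data looks like), and whether it helps performance would have to be measured:

```python
import os
import tempfile

import numpy as np

# Hypothetical data file: a flat binary file of float64 values.
# The real file layout is not given in the thread.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
np.arange(10**6, dtype=np.float64).tofile(path)

# Map the file rather than reading it; with no shape given, numpy
# infers a 1-D shape from the file size and the dtype.
data = np.memmap(path, dtype=np.float64, mode="r")

# Slicing touches only the part of the file backing this range.
# np.array() copies the slice into ordinary memory.
chunk = np.array(data[500000:500010])
chunk_sum = float(chunk.sum())
```

The mapped array can be indexed like a normal ndarray, so existing code that works on slices should need few changes.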
Depending on your system and OS, I would agree with Chris that you are
most likely to be I/O bound. If so, you have to look at a different
approach to overcome that barrier.
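
One rough way to check this, sketched below under the assumption that comparing wall-clock time with CPU time is a good enough first indicator: if a step takes much longer on the wall clock than it uses in CPU time, it is mostly waiting (typically on I/O). The helper wall_and_cpu and the two toy functions are hypothetical stand-ins, not part of anyone's script:

```python
import time


def wall_and_cpu(func, *args):
    """Return (result, wall seconds, CPU seconds) for one call of func."""
    wall0 = time.perf_counter()
    cpu0 = time.process_time()
    result = func(*args)
    cpu = time.process_time() - cpu0
    wall = time.perf_counter() - wall0
    return result, wall, cpu


def waits(seconds):
    # Stand-in for a step that spends its time waiting on the disk.
    time.sleep(seconds)
    return seconds


def computes(n):
    # Stand-in for a step that spends its time doing arithmetic.
    return sum(i * i for i in range(n))


_, wall_io, cpu_io = wall_and_cpu(waits, 0.2)
_, wall_cpu, cpu_cpu = wall_and_cpu(computes, 2000000)

print("waiting step:   wall %.3fs, cpu %.3fs" % (wall_io, cpu_io))
print("computing step: wall %.3fs, cpu %.3fs" % (wall_cpu, cpu_cpu))
```

Wrapping the real reading function the same way should show which of the two cases it resembles.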
If you are not I/O bound then you need to find out what is limiting
your performance (for example with Robert Kern's line_profiler,
http://pypi.python.org/pypi/line_profiler/). If you find it is CPU-bound
then you might benefit from multiple CPUs, which has been addressed
multiple times on the list.
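
As a sketch of that profiling step, here is the same idea using cProfile from the standard library instead of line_profiler, so it runs without installing anything. It reports time per function rather than per line. The functions parse_line and load and the sample data are hypothetical placeholders for the real reading code:

```python
import cProfile
import io
import pstats


def parse_line(line):
    # Hypothetical per-line work: convert whitespace-separated integers.
    return [int(tok) for tok in line.split()]


def load(lines):
    return [parse_line(line) for line in lines]


# Made-up input standing in for the contents of a data file.
lines = ["1 2 3 4 5"] * 20000

profiler = cProfile.Profile()
profiler.enable()
rows = load(lines)
profiler.disable()

# Collect the report in a string, sorted by cumulative time,
# limited to the five most expensive entries.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

With line_profiler itself, the usual route is to decorate the function of interest with @profile and run the script with "kernprof -l -v script.py" to get per-line timings.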
Cython is probably a very viable option for what you have described.