[Numpy-discussion] List/location of consecutive integers (2)

Bruce Southey bsouthey@gmail....
Wed May 27 08:24:06 CDT 2009


Christopher Barker wrote:
> Andrea Gavana wrote:
>   
>>     I have tried the solutions proposed in the previous thread and it
>> looks like Chris' one is the fastest for my purposes.
>>     
>
> whoo hoo! What do I win? ;-)
>
>   
>> Splitting the reading process between 4 processes will require the
>> exchange of 5-20 MB from the child processes to the main one: do you
>> think my script will benefit from using multiprocessing?
>>     
>
> If you are talking about multiprocessing to read the data in -- I don't 
> think so -- that's probably IO bound anyway. You can't make your disks 
> faster with multiple processors.
>
>   
>> Should I try another approach?
>>     
>
> I don't know it will do anything for performance, but you might want to 
> look at memory mapped arrays -- it's a very cool way to work with data 
> files too big to want to bring into memory all at once.
>
> -Chris
>
>
>   

Depending on your system and OS, I would agree with Chris that you are 
most likely to be I/O bound.  If so, you have to look at a different 
approach to overcome that barrier.
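
As a rough illustration of the memory-mapped arrays Chris mentions, here is a 
minimal sketch using numpy.memmap. The file name, the int32 dtype, the chunk 
size and the helper names are all assumptions made up for the example, not 
anything from Andrea's script. It may or may not help performance, but it 
avoids loading the whole file into memory at once.

```python
import os
import tempfile

import numpy as np


def write_demo_file(path, n):
    """Write the integers 0..n-1 as raw int32 values (stand-in data file)."""
    np.arange(n, dtype=np.int32).tofile(path)


def chunked_sum(path, chunk=100_000):
    """Sum a raw int32 file through a read-only memory map, chunk by chunk."""
    # mode="r" maps the file read-only; nothing is read until it is sliced.
    data = np.memmap(path, dtype=np.int32, mode="r")
    total = 0
    for start in range(0, data.shape[0], chunk):
        # Only this slice of the file needs to be paged into memory.
        total += int(data[start:start + chunk].sum(dtype=np.int64))
    del data  # drop the reference so the mapping can be released
    return total


# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
write_demo_file(path, 1_000_000)
result = chunked_sum(path)
```

The chunk size is arbitrary here; in practice it would be tuned to the data 
and the available memory.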

If you are not I/O bound then you need to find out what is limiting 
your performance (for example with Robert Kern's line_profiler, 
http://pypi.python.org/pypi/line_profiler/). If you find it is CPU-bound 
then you might gain benefits from multiple CPUs, which has been 
addressed multiple times on the list.
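
To give an idea of what finding the bottleneck can look like, here is a small 
sketch. Note that it uses the standard library's cProfile rather than 
line_profiler, so it reports time per function, not per line. The find_runs 
function is only a hypothetical stand-in for the consecutive-integer search 
from this thread, and profile_call is a helper name invented for the example.

```python
import cProfile
import io
import pstats


def find_runs(values):
    """Return (start_index, length) for each run of consecutive integers."""
    runs = []
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the list or where the step is not +1.
        if i == len(values) or values[i] != values[i - 1] + 1:
            runs.append((start, i - start))
            start = i
    return runs


def profile_call(func, *args):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return result, stream.getvalue()


runs, report = profile_call(find_runs, [1, 2, 3, 7, 8, 10])
```

Printing report shows where the time goes. If most of it sits in your own 
Python loops rather than in reading the file, that points towards being 
CPU-bound.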

Cython is probably a very viable option for what you have described. 

Bruce 


