[Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication
Wed Jun 15 19:09:46 CDT 2011
On 15.06.2011 23:22, Christopher Barker wrote:
> It would also would be great if someone that actually understands this
> stuff could look at his code and explain why the slowdown occurs (hint,
Not sure I qualify, but I think I notice several potential problems in
the OP's multiprocessing/NumPy code:
"innerProductList = pool.map(myutil.numpy_inner_product, arrayList)"
1. Here we potentially have a case of false sharing and/or mutex
contention, because the work is too fine-grained. pool.map does no
load balancing, so for it to scale nicely, each work item must take
a substantial amount of time relative to the dispatch overhead. I
suspect this is the main issue.
2. There is also the question of when the process pool is spawned.
Though I haven't checked, I suspect it happens prior to calling
pool.map. If it does not, process startup is a factor as well,
particularly on Windows, where worker processes cannot be forked and
must be started from scratch (less so on Linux and Apple).
3. "arrayList" is serialised by pickling, which has a significant
overhead. It is not shared memory either, as the OP's code implies;
the main cost is the slowness of cPickle.
"IPs = N.array(innerProductList)"
4. numpy.array is a very slow function. The benchmark should preferably
exclude this conversion overhead.
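If the conversion cannot be avoided, preallocating the result and filling it in sidesteps numpy.array's type and shape inference on a list of arrays; either way, the timed region should stop before this step. A sketch:

```python
import numpy as np

# Stand-in for the list of inner products returned by pool.map.
innerProductList = [np.random.rand(4, 4) for _ in range(100)]

# Preallocate the output and copy each result in, instead of letting
# numpy.array infer shape and dtype from a list of arrays.
IPs = np.empty((len(innerProductList),) + innerProductList[0].shape)
for i, ip in enumerate(innerProductList):
    IPs[i] = ip
```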