[SciPy-user] Dealing with Large Data Sets
Sun May 11 02:38:50 CDT 2008
Anne Archibald wrote:
> 2008/5/10 Damian Eads <firstname.lastname@example.org>:
>> Damian Eads wrote:
>>> which perform the operations in an in-place fashion. If data.sum(axis =
>>> 2) is large, preallocate an array to store the sum,
>>> # for summing over columns
>>> sum_result = numpy.zeros(data.shape[0:2])
>> I meant to include
>> data **= 2
>> np.sum(data, axis=2, out=sum_result)
>> which does an in-place, element-wise exponentiate, sums over the
>> columns, and stores the result in sum_result.
> What is the advantage to preallocating the result rather than letting
> sum() do the allocation?
If the computation is repeated millions of times and the sum array is
large (hundreds of MB), then allocating the sum array once is certainly
cheaper than allocating a fresh one for every computation.
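A minimal sketch of the pattern being discussed: square the data in
place, then reduce over the last axis into a preallocated buffer that is
reused across iterations. The array shape and the small loop count here
are illustrative assumptions, not from the original posts.

```python
import numpy as np

# Illustrative data; shapes are assumptions for the sketch.
data = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Allocate the result buffer once, outside the loop.
sum_result = np.zeros(data.shape[0:2])

for _ in range(3):  # stands in for the "millions of times" loop
    data **= 2                            # in-place element-wise square
    np.sum(data, axis=2, out=sum_result)  # reuse the same output buffer

print(sum_result.shape)  # (2, 3)
```

The `out=` keyword of `numpy.sum` writes the reduction result into the
existing array instead of allocating a new one each pass, which is where
the saving comes from when the result is large.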