# [Numpy-discussion] Optimizing mean(axis=0) on a 3D array

Travis Oliphant oliphant.travis at ieee.org
Sat Aug 26 05:26:32 CDT 2006

```
Martin Spacek wrote:
> Hello,
>
> I'm a bit ignorant of optimization in numpy.
>
> I have a movie with 65535 32x32 frames stored in a 3D array of uint8
> with shape (65535, 32, 32). I load it from an open file f like this:
>
>  >>> import numpy as np
>  >>> data = np.fromfile(f, np.uint8, count=65535*32*32)
>  >>> data = data.reshape(65535, 32, 32)
>
> I'm picking several thousand frames more or less randomly from
> throughout the movie and finding the mean frame over those frames:
>
>  >>> meanframe = data[frameis].mean(axis=0)
>
> frameis is a 1D array of frame indices with no repeated values in it. If
> it has say 4000 indices in it, then the above line takes about 0.5 sec
> to complete on my system. I'm doing this for a large number of different
> frameis, some of which can have many more indices in them. All this
> takes many minutes to complete, so I'm looking for ways to speed it up.
>
> If I divide it into 2 steps:
>
>  >>> temp = data[frameis]
>  >>> meanframe = temp.mean(axis=0)
>
> and time it, I find the first step takes about 0.2 sec, and the second
> takes about 0.3 sec. So it's not just the mean() step, but also the
> indexing step that's taking some time.
>

If frameis is 1-D, then you should be able to use

temp = data.take(frameis,axis=0)

for the first step.   This can be quite a bit faster (and is a big
reason why take is still around).   There are several reasons for this
(one of which is that index checking is done over the entire list when
using indexing).

-Travis

```
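The suggestion in the reply can be checked with a short script. This is a minimal sketch, not a benchmark from the thread: the random `data` array is a smaller stand-in for the movie (the original was loaded from a file with shape (65535, 32, 32)), and the sizes, seed, and the names `temp_index` and `temp_take` are assumptions made for illustration. It confirms that `take` along axis 0 returns the same frames as indexing with a 1-D index array, then times both. The relative speed depends on the NumPy version and hardware, so the timings should be measured rather than assumed.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the movie; smaller than the original (65535, 32, 32) to keep memory use low.
data = rng.integers(0, 256, size=(8192, 32, 32), dtype=np.uint8)

# 1-D array of frame indices with no repeated values, in random order.
frameis = rng.choice(data.shape[0], size=4000, replace=False)

# Step 1, two ways: index-array ("fancy") indexing versus take along axis 0.
temp_index = data[frameis]
temp_take = data.take(frameis, axis=0)

# Step 2: mean frame over the selected frames (accumulated as float64 for integer input).
meanframe = temp_take.mean(axis=0)

# Rough timing of the selection step only; results vary by NumPy version and machine.
t_index = timeit.timeit(lambda: data[frameis], number=5)
t_take = timeit.timeit(lambda: data.take(frameis, axis=0), number=5)
print(f"indexing: {t_index:.4f} s, take: {t_take:.4f} s (5 runs each)")
```

Both selections copy the chosen frames into a new array, so the remaining cost in either case is the copy itself plus the reduction in `mean`.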