# [Numpy-discussion] Assigning complex value to real array

Andrew P. Mullhaupt doc@zen-pharaohs....
Thu Oct 7 21:20:23 CDT 2010

```  On 10/7/2010 6:52 PM, Matthew Brett wrote:
> Hi,
>
> On Thu, Oct 7, 2010 at 3:47 PM, Andrew P. Mullhaupt
> <doc@zen-pharaohs.com>  wrote:
>>> Most machines now
>>> and in the future are not going to choke on these issues (for a variety
>>> of reasons).
> 'Talk is cheap, show me the code' .

Yes, let's.

First we want to initialize memory mapped arrays which will be used for
the variables. This python script does that:

from numpy import *

N = 819200

X = random.uniform(0, 1, 2*N)
Y = random.uniform(0, 1, 2*N)
Z = random.uniform(0, 1, 2*N)

p = random.uniform(0, 1, N)
q = random.uniform(0, 1, N)
r = random.uniform(0, 1, N)
s = random.uniform(0, 1, N)
w = random.uniform(0, 1, N)
z = random.uniform(0, 1, N)

save('X', X)
save('Y', Y)
save('Z', Z)
save('p', p)
save('q', q)
save('r', r)
save('s', s)
save('w', w)
save('z', z)

Note that we have two arrays of length 2*N representing complex arrays
with interleaved real and imaginary parts, and we have four arrays of
length N representing the segregated real and complex parts.

We also create arrays for the sums; these are not sparse (since we are
concerned with the case where there actually are imaginary parts that
would require memory bandwidth).

Since we want to check the memory bandwidth we need a test which
essentially shovels these arrays through memory very quickly. I have
chosen to produce a result to ensure that everything actually gets done
- although this is probably not strictly necessary, why not.

First the interleaved version:

from numpy import *

N = Z.size / 2

its = int(8192*100000 / N)

for k in range(its) :
Z[:] = X + Y

print Z.max()

Then the segregated version:

from numpy import *

N = s.size

its = int(8192*100000 / N)

for k in range(its) :
w[:] = p + r
z[:] = s + q

print maximum(w.max(), z.max())

Note that the segregated version does not "cheat" by computing  p + q
and  r + s  which presumably span less address space. In fact, we would
not expect that timing to be too much different.

Now lets conduct the interleaved timing:

andrew@flyer:/media/fraid/andrew/junk\$ time python interleave.py
1.99960173329

real    0m6.438s
user    0m6.400s
sys    0m0.020s

and the segregated one:

andrew@flyer:/media/fraid/andrew/junk\$ time python segregate.py
1.99858982651

real    0m6.198s
user    0m6.140s
sys    0m0.050s

Well OK, I would describe this as pretty close. Oddly enough the
segregated version is faster.

Another pair of runs with larger files:

andrew@flyer:/media/fraid/andrew/junk\$ time python interleave.py
1.99995091618

real    0m11.480s
user    0m6.070s
sys    0m4.400s

andrew@flyer:/media/fraid/andrew/junk\$ time python segregate.py
1.99989270428

real    0m12.865s
user    0m7.250s
sys    0m4.330s

Here we end up with another close run, but the segregated  version is
slower.

So yeah, here is code you can use to see if your machine chokes on the
old sort of issues (which were very common in early RISC workstation
architectures - that awful TLB on the RS/6000 series springs immediately
to mind).

You can strobe through file sizes if you want to try and tickle one of
those weirder creases in memory architecture on your machine, but you
will probably not find anything that blatant. People who design
computers have been through this rodeo a couple times.

Best regards,
Andrew Mullhaupt
```