[Numpy-discussion] effectively computing variograms with numpy

Hanno Klemm klemm@phys.ethz...
Mon Jun 25 03:10:10 CDT 2007


Tim,

Thank you very much, the code does what's it expected to do.
Unfortunately the thing is still pretty slow on large data sets. I
will probably now look for ways to calculate the variogram from some
random samples of my data. Thanks for the observation regarding the
square array, that would have bitten me, later.

Again, thanks,

Hanno


Timothy Hochberg <tim.hochberg@ieee.org> said:

> ------=_Part_160363_13970610.1182535377481
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> Content-Transfer-Encoding: 7bit
> Content-Disposition: inline
> 
> OK, generally in code like this I leave the outer loops alone and try to
> vectorize just the inner loop.I have some ideas in this direction, but
> first, there seems to be some problems with the code at well. The
code looks
> like it is written to take non-square 'data' arrays. However,
> 
>        for i in range(data.shape[0]):
>            datasquared = (data - data[:,i])**2
> 
> This is looping over shape[0], but indexing on axis-1, which doesn't
work
> for non-square arrays. One suggestion is make a function to compute the
> variogram along a given axis and then calling it twice instead of
computing
> them both independently. Can you try the following code and see if this
> correctly implements a variogram? I don't have time to check that it
really
> implements a variogram, but I'm hoping it's close:
> 
>     def variogram(data, binsize, axis=-1):
>         data = data.swapaxes(-1, axis)
>         n = data.shape[-1]
>         resultsize = int(N.ceil(n / float(binsize)))
>         result = N.zeros([resultsize], data.dtype)
>         for i in range(resultsize):
>             j0 = max(i*binsize, 1)
>             j1 = min(j0+binsize, n)
>             denominator = 0
>             for j in range(j0, j1):
>                 d2 = (data[...,j:] - data[...,:-j])**2
>                 result[i] += d2.sum()
>                 denominator += N.prod(d2.shape)
>             result[i] /= denominator
>         return result
> 
> 
> 
> 
> On 6/22/07, Hanno Klemm <klemm@phys.ethz.ch > wrote:
> >
> > Tim,
> >
> > this is the best I could come up with until now:
> >
> >
> > import numpy as N
> >
> > def naive_variogram(data, binsize=100., stepsize=5.):
> >     """calculates variograms along the rows and columns of the given
> >     array which is supposed to contain equally spaced data with
> >     stepsize stepsize"""
> >
> >     # how many elements do fit in one bin?
> >
> >     binlength = int(binsize/stepsize)
> >
> >     #bins in x- and y- direction (+1 for the possible
> >     #elements larger than int(binsize/stepsize):
> >     x_bins = (data.shape[1])/binlength+1
> >     y_bins = (data.shape[0])/binlength+1
> >
> >     #arrays to store the reuslts in
> >     x_result = N.zeros(x_bins, dtype = float)
> >     y_result = N.zeros(y_bins, dtype = float)
> >
> >     #arrays to get teh denominators right
> >     x_denominators = N.zeros(x_bins, dtype=float)
> >     y_denominators = N.zeros(x_bins, dtype=float)
> >
> >     #what is the last index?
> >     xlast = data.shape[1]
> >     ylast = data.shape[0]
> >     for i in range(data.shape[0]):
> >         datasquared = (data - data[:,i])**2
> >         #number of bins to fill until the end of the array:
> >         numbins = 1 + (xlast - i)/binlength
> >         for j in range(numbins):
> >             x_result[j]+=\
> > datasquared[:,i+1+j*binlength:i+1+(j+1)*binlength].sum()
> >             x_denominators[j] +=\
> > datasquared[:,i+1+j*binlength:i+1+(j+1)*binlength].size
> >         try:
> >             #Is there a rest?
> >             x_result[numbins] +=
> > datasquared[:,i+1+numbins*binlength:].sum()
> >             x_denominators[numbins] +=
> > datasquared[:,i+1+numbins*binlength:].size
> >         except IndexError:
> >             pass
> >
> >     x_result /= x_denominators
> >
> >
> >     for i in range(data.shape[1]):
> >         datasquared = (data - data[i])**2
> >         #number of bins to fill until the end of the array:
> >         numbins = 1 + (ylast - i)/binlength
> >         #Fill the bins
> >         for j in range(numbins):
> >
> > y_result[j]+=datasquared[i+1+j*binlength:i+1+(j+1)*binlength].sum()
> >             y_denominators[j] +=
> > datasquared[i+1+j*binlength:i+1+(j+1)*binlength].size
> >         try:
> >             #Is there a rest?
> >             y_result[numbins] +=
> > datasquared[:,i+1+numbins*binlength:].sum()
> >             y_denominators[numbins] +=
> > datasquared[:,i+1+numbins*binlength:].size
> >         except IndexError:
> >             pass
> >
> >     y_result /= y_denominators
> >
> >     return x_result, y_result
> >
> > Thanks,
> > Hanno
> >
> >
> > Timothy Hochberg < tim.hochberg@ieee.org> said:
> >
> > > ------=_Part_157389_1558912.1182523880067
> > > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> > > Content-Transfer-Encoding: 7bit
> > > Content-Disposition: inline
> > >
> > > On 6/22/07, Hanno Klemm <klemm@phys.ethz.ch> wrote:
> > > >
> > > >
> > > > Hi,
> > > >
> > > > I have an array which represents regularly spaced spatial
data. I now
> > > > would like to compute the (semi-)variogram, i.e.
> > > >
> > > > gamma(h) = 1/N(h) \sum_{i,j\in N(h)} (z_i - z_j)**2,
> > > >
> > > > where h is the (approximate) spatial difference between the
> > > > measurements z_i, and z_j, and N(h) is the number of
measurements with
> > > > distance h.
> > > >
> > > > However, I only want to calculate the thing along the rows and
> > > > columns. The naive approach involves two for loops and a lot of
> > > > searching, which becomes painfully slow on large data sets.
Are there
> > > > better implementations around in numpy/scipy or does anyone have a
> > > > good idea of how to do that more efficient? I looked around a
bit but
> > > > couldn't find anything.
> > >
> > >
> > > Can you send the naive code as well. Its often easier to see what's
> > going on
> > > with code in addition to the equations.
> > >
> > > Regards.
> > >
> > > -tim
> > >
> > >
> > >
> > > --
> > > .  __
> > > .   |-\
> > > .
> > > .  tim.hochberg@ieee.org
> > >
> > > ------=_Part_157389_1558912.1182523880067
> > > Content-Type: text/html; charset=ISO-8859-1
> > > Content-Transfer-Encoding: 7bit
> > > Content-Disposition: inline
> > >
> > > <br><br><div><span class="gmail_quote">On 6/22/07, <b
> > class="gmail_sendername">Hanno Klemm</b> &lt;<a
> > href="mailto:klemm@phys.ethz.ch ">klemm@phys.ethz.ch </a>&gt;
> > wrote:</span><blockquote class="gmail_quote" style="border-left: 1px
> > solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:
1ex;">
> > > <br>Hi,<br><br>I have an array which represents regularly spaced
> > spatial data. I now<br>would like to compute the (semi-)variogram,
> > i.e.<br><br>gamma(h) = 1/N(h) \sum_{i,j\in N(h)} (z_i -
> > z_j)**2,<br><br>where h is the (approximate) spatial difference
> > between the
> > > <br>measurements z_i, and z_j, and N(h) is the number of
> > measurements with<br>distance h.<br><br>However, I only want to
> > calculate the thing along the rows and<br>columns. The naive approach
> > involves two for loops and a lot of
> > > <br>searching, which becomes painfully slow on large data sets. Are
> > there<br>better implementations around in numpy/scipy or does anyone
> > have a<br>good idea of how to do that more efficient? I looked around
> > a bit but<br>couldn&#39;t find anything.
> > > </blockquote><div><br>Can you send the naive code as well. Its often
> > easier to see what&#39;s going on with code in addition to the
> > equations.<br><br>Regards.<br><br>-tim<br><br></div></div><br
> > clear="all"><br>-- <br>.&nbsp;&nbsp;__
> > > <br>.&nbsp;&nbsp; |-\<br>.<br>.&nbsp;&nbsp;<a
> > href="mailto:tim.hochberg@ieee.org"> tim.hochberg@ieee.org</a>
> > >
> > > ------=_Part_157389_1558912.1182523880067--
> > >
> >
> >
> >
> > --
> > Hanno Klemm
> > klemm@phys.ethz.ch
> >
> >
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion@scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
> 
> 
> 
> -- 
> .  __
> .   |-\
> .
> .  tim.hochberg@ieee.org
> 
> ------=_Part_160363_13970610.1182535377481
> Content-Type: text/html; charset=ISO-8859-1
> Content-Transfer-Encoding: 7bit
> Content-Disposition: inline
> 
> <br>OK, generally in code like this I leave the outer loops alone
and try to vectorize just the inner loop.I have some ideas in this
direction, but first, there seems to be some problems with the code at
well. The code looks like it is written to take non-square
&#39;data&#39; arrays. However, 
> <br><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for i in
range(data.shape[0]):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
datasquared = (data - data[:,i])**2<br><div style="margin-left:
40px;"><br></div>This is looping over shape[0], but indexing on
axis-1, which doesn&#39;t work for non-square arrays. One suggestion
is make a function to compute the variogram along a given axis and
then calling it twice instead of computing them both independently.
Can you try the following code and see if this correctly implements a
variogram? I don&#39;t have time to check that it really implements a
variogram, but I&#39;m hoping it&#39;s close:
> <br><br>&nbsp;&nbsp;&nbsp; def variogram(data, binsize,
axis=-1):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; data =
data.swapaxes(-1, axis)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
n = data.shape[-1]<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
resultsize = int(N.ceil(n /
float(binsize)))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; result
= N.zeros([resultsize], 
> data.dtype)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for i in
range(resultsize):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
j0 = max(i*binsize,
1)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
j1 = min(j0+binsize,
n)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
denominator =
0<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
for j in range(j0,
j1):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
d2 = (data[...,j:] - data[...,:-j])**2
>
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
result[i] +=
d2.sum()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
denominator +=
N.prod(d2.shape)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
result[i] /= denominator<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
return result<br><br><br><br><br><div><span class="gmail_quote">On
6/22/07, <b class="gmail_sendername">
> Hanno Klemm</b> &lt;<a href="mailto:klemm@phys.ethz.ch"
target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">klemm@phys.ethz.ch
> </a>&gt; wrote:</span><blockquote class="gmail_quote"
style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt
0.8ex; padding-left: 1ex;">
> Tim,<br><br>this is the best I could come up with until
now:<br><br><br>import numpy as N<br><br>def naive_variogram(data,
binsize=100.,
stepsize=5.):<br>&nbsp;&nbsp;&nbsp;&nbsp;&quot;&quot;&quot;calculates
variograms along the rows and columns of the given
> <br>&nbsp;&nbsp;&nbsp;&nbsp;array which is supposed to contain
equally spaced data with<br>&nbsp;&nbsp;&nbsp;&nbsp;stepsize
stepsize&quot;&quot;&quot;<br><br>&nbsp;&nbsp;&nbsp;&nbsp;# how many
elements do fit in one bin?<br><br>&nbsp;&nbsp;&nbsp;&nbsp;binlength =
int(binsize/stepsize)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;#bins in x- and
y- direction (+1 for the possible
> <br>&nbsp;&nbsp;&nbsp;&nbsp;#elements larger than
int(binsize/stepsize):<br>&nbsp;&nbsp;&nbsp;&nbsp;x_bins =
(data.shape[1])/binlength+1<br>&nbsp;&nbsp;&nbsp;&nbsp;y_bins =
(data.shape[0])/binlength+1<br><br>&nbsp;&nbsp;&nbsp;&nbsp;#arrays to
store the reuslts in<br>&nbsp;&nbsp;&nbsp;&nbsp;x_result =
N.zeros(x_bins, dtype = float)
> <br>&nbsp;&nbsp;&nbsp;&nbsp;y_result = N.zeros(y_bins, dtype =
float)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;#arrays to get teh denominators
right<br>&nbsp;&nbsp;&nbsp;&nbsp;x_denominators = N.zeros(x_bins,
dtype=float)<br>&nbsp;&nbsp;&nbsp;&nbsp;y_denominators =
N.zeros(x_bins, dtype=float)<br><br>&nbsp;&nbsp;&nbsp;&nbsp;#what is
the last index?
> <br>&nbsp;&nbsp;&nbsp;&nbsp;xlast =
data.shape[1]<br>&nbsp;&nbsp;&nbsp;&nbsp;ylast =
data.shape[0]<br>&nbsp;&nbsp;&nbsp;&nbsp;for i in
range(data.shape[0]):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;datasquared
= (data -
data[:,i])**2<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#number
of bins to fill until the end of the
array:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;numbins = 1
+ (xlast - i)/binlength
> <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for j in
range(numbins):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x_result[j]+=\<br>datasquared[:,i+1+j*binlength:i+1+(j+1)*binlength].sum()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x_denominators[j]
+=\<br>datasquared[:,i+1+j*binlength:i+1+(j+1)*binlength].size
>
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#Is
there a
rest?<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x_result[numbins]
+=<br>datasquared[:,i+1+numbins*binlength:].sum()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x_denominators[numbins]
+=<br>datasquared[:,i+1+numbins*binlength:].size
> <br>
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except
IndexError:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pass<br><br>&nbsp;&nbsp;&nbsp;&nbsp;x_result
/= x_denominators<br><br><br>&nbsp;&nbsp;&nbsp;&nbsp;for i in
range(data.shape[1]):<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;datasquared
= (data -
data[i])**2<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#number
of bins to fill until the end of the array:
> <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;numbins = 1 +
(ylast -
i)/binlength<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#Fill
the bins<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for j in
range(numbins):<br><br>y_result[j]+=datasquared[i+1+j*binlength:i+1+(j+1)*binlength].sum()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y_denominators[j]
+=
> <br>
>
datasquared[i+1+j*binlength:i+1+(j+1)*binlength].size<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;#Is
there a
rest?<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y_result[numbins]
+=<br>datasquared[:,i+1+numbins*binlength:].sum()<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;y_denominators[numbins]
+=
>
<br>datasquared[:,i+1+numbins*binlength:].size<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except
IndexError:<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pass<br><br>&nbsp;&nbsp;&nbsp;&nbsp;y_result
/= y_denominators<br><br>&nbsp;&nbsp;&nbsp;&nbsp;return x_result,
y_result<br><br>Thanks,<br>Hanno<br><br><br>Timothy Hochberg &lt;
> <a href="mailto:tim.hochberg@ieee.org" target="_blank"
onclick="return
top.js.OpenExtLink(window,event,this)">tim.hochberg@ieee.org</a>&gt;
said:<br><br>&gt; ------=_Part_157389_1558912.1182523880067<br>&gt;
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> <br>&gt; Content-Transfer-Encoding: 7bit
> <br>&gt; Content-Disposition: inline<br>&gt;<br>&gt; On 6/22/07,
Hanno Klemm &lt;<a href="mailto:klemm@phys.ethz.ch" target="_blank"
onclick="return
top.js.OpenExtLink(window,event,this)">klemm@phys.ethz.ch</a>&gt; wrote:
> <br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; Hi,<br>&gt; &gt;<br>&gt;
&gt; I have an array which represents regularly spaced spatial data. I now
> <br>&gt; &gt; would like to compute the (semi-)variogram,
i.e.<br>&gt; &gt;<br>&gt; &gt; gamma(h) = 1/N(h) \sum_{i,j\in N(h)}
(z_i - z_j)**2,<br>&gt; &gt;<br>&gt; &gt; where h is the (approximate)
spatial difference between the
> <br>&gt; &gt; measurements z_i, and z_j, and N(h) is the number of
measurements with<br>&gt; &gt; distance h.<br>&gt; &gt;<br>&gt; &gt;
However, I only want to calculate the thing along the rows and<br>&gt;
&gt; columns. The naive approach involves two for loops and a lot of
> <br>&gt; &gt; searching, which becomes painfully slow on large data
sets. Are there<br>&gt; &gt; better implementations around in
numpy/scipy or does anyone have a<br>&gt; &gt; good idea of how to do
that more efficient? I looked around a bit but
> <br>&gt; &gt; couldn&#39;t find anything.<br>&gt;<br>&gt;<br>&gt;
Can you send the naive code as well. Its often easier to see
what&#39;s<br>going on<br>&gt; with code in addition to the
equations.<br>&gt;<br>&gt; Regards.
> <br>&gt;<br>&gt; -tim<br>&gt;<br>&gt;<br>&gt;<br>&gt; --<br>&gt;
&nbsp;&nbsp;__<br>&gt; .&nbsp;&nbsp; |-\<br>&gt; .<br>&gt;
&nbsp;&nbsp;<a href="mailto:tim.hochberg@ieee.org" target="_blank"
onclick="return
top.js.OpenExtLink(window,event,this)">tim.hochberg@ieee.org
> </a><br>&gt;<br>&gt; ------=_Part_157389_1558912.1182523880067
> <br>&gt; Content-Type: text/html; charset=ISO-8859-1<br>&gt;
Content-Transfer-Encoding: 7bit<br>&gt; Content-Disposition:
inline<br>&gt;<br>&gt; &lt;br&gt;&lt;br&gt;&lt;div&gt;&lt;span
class=&quot;gmail_quote&quot;&gt;On 6/22/07, &lt;b
> <br>class=&quot;gmail_sendername&quot;&gt;Hanno Klemm&lt;/b&gt;
&amp;lt;&lt;a<br>href=&quot;mailto:<a href="mailto:klemm@phys.ethz.ch"
target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">klemm@phys.ethz.ch
> </a>&quot;&gt;<a href="mailto:klemm@phys.ethz.ch" target="_blank"
onclick="return top.js.OpenExtLink(window,event,this)">klemm@phys.ethz.ch
> </a>&lt;/a&gt;&amp;gt;<br>wrote:&lt;/span&gt;&lt;blockquote
class=&quot;gmail_quote&quot; style=&quot;border-left: 1px<br>solid
rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left:
1ex;&quot;&gt;<br>&gt; &lt;br&gt;Hi,&lt;br&gt;&lt;br&gt;I have an
array which represents regularly spaced
> <br>spatial data. I now&lt;br&gt;would like to compute the
(semi-)variogram,<br>i.e.&lt;br&gt;&lt;br&gt;gamma(h) = 1/N(h)
\sum_{i,j\in N(h)} (z_i -<br>z_j)**2,&lt;br&gt;&lt;br&gt;where h is
the (approximate) spatial difference
> <br>between the<br>&gt; &lt;br&gt;measurements z_i, and z_j, and
N(h) is the number of<br>measurements with&lt;br&gt;distance
h.&lt;br&gt;&lt;br&gt;However, I only want to<br>calculate the thing
along the rows and&lt;br&gt;columns. The naive approach
> <br>involves two for loops and a lot of<br>&gt; &lt;br&gt;searching,
which becomes painfully slow on large data sets.
Are<br>there&lt;br&gt;better implementations around in numpy/scipy or
does anyone<br>have a&lt;br&gt;good idea of how to do that more
efficient? I looked around
> <br>a bit but&lt;br&gt;couldn&amp;#39;t find anything.<br>&gt;
&lt;/blockquote&gt;&lt;div&gt;&lt;br&gt;Can you send the naive code as
well. Its often<br>easier to see what&amp;#39;s going on with code in
addition to the<br>
> 
> 
>
equations.&lt;br&gt;&lt;br&gt;Regards.&lt;br&gt;&lt;br&gt;-tim&lt;br&gt;&lt;br&gt;&lt;/div&gt;&lt;/div&gt;&lt;br<br>clear=&quot;all&quot;&gt;&lt;br&gt;--
&lt;br&gt;.&amp;nbsp;&amp;nbsp;__<br>&gt;
&lt;br&gt;.&amp;nbsp;&amp;nbsp;
|-\&lt;br&gt;.&lt;br&gt;.&amp;nbsp;&amp;nbsp;&lt;a
> <br>href=&quot;mailto:<a href="mailto:tim.hochberg@ieee.org"
target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">tim.hochberg@ieee.org</a>&quot;&gt;<a
href="mailto:tim.hochberg@ieee.org" target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">
> 
> tim.hochberg@ieee.org</a>&lt;/a&gt;<br>&gt;<br>&gt;
------=_Part_157389_1558912.1182523880067--
> <br>&gt;<br><br><br><br>--<br>Hanno Klemm<br><a
href="mailto:klemm@phys.ethz.ch" target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">klemm@phys.ethz.ch</a><br><br><br>_______________________________________________
> <br>Numpy-discussion mailing list<br><a
href="mailto:Numpy-discussion@scipy.org" target="_blank"
onclick="return top.js.OpenExtLink(window,event,this)">
> Numpy-discussion@scipy.org</a><br><a
href="http://projects.scipy.org/mailman/listinfo/numpy-discussion"
target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">http://projects.scipy.org/mailman/listinfo/numpy-discussion
> </a><br></blockquote></div><br><br clear="all"><br>-- <br>
> .&nbsp;&nbsp;__<br>.&nbsp;&nbsp; |-\<br>.<br>.&nbsp;&nbsp;<a
href="mailto:tim.hochberg@ieee.org" target="_blank" onclick="return
top.js.OpenExtLink(window,event,this)">tim.hochberg@ieee.org</a>
> 
> ------=_Part_160363_13970610.1182535377481--
> 



-- 
Hanno Klemm
klemm@phys.ethz.ch




More information about the Numpy-discussion mailing list