[SciPy-User] re[SciPy-user] moving for loops...

mdekauwe mdekauwe@gmail....
Thu Jun 10 15:36:22 CDT 2010


OK I think it is clear now!! Although what does the -1 bit do, this is surely
the same as saying 11, 12 or numyears, nummonths?

thanks.



Benjamin Root-2 wrote:
> 
> Well, let's try a more direct example.  I am going to create a 4d array of
> random values to illustrate.  I know the length of the dimensions won't be
> exactly the same as yours, but the example will still be valid.
> 
> In this example, I will be able to calculate *all* of the monthly averages
> for *all* of the variables for *all* of the grid points without a single
> loop.
> 
>> jules = np.random.random((132, 10, 50, 3))
>> print jules.shape
> (132, 10, 50, 3)
> 
>> jules_5d = np.reshape(jules, (-1, 12) + jules.shape[1:])
>> print jules_5d.shape
> (11, 12, 10, 50, 3)
> 
>> jules_5d = np.ma.masked_array(jules_5d, mask=jules_5d < 0.0)
> 
>> jules_means = np.mean(jules_5d, axis=0)
>> print jules_means.shape
> (12, 10, 50, 3)
> 
> voila! This matrix has a mean for each month across all eleven years for
> each datapoint in each of the 10 variables at each (I am assuming) level
> in
> the atmosphere.
> 
> So, if you want to operate on a subset of your jules matrix (for example,
> you need to do special masking for each variable), then you can just work
> off of a slice of the original matrix, and many of these same concepts in
> this example and the previous example still applies.
> 
> Ben Root
> 
> 
> On Thu, Jun 10, 2010 at 1:08 PM, mdekauwe <mdekauwe@gmail.com> wrote:
> 
>>
>> Hi,
>>
>> No if I am honest I am a little confused how what you are suggesting
>> would
>> work. As I see it the array I am trying to average from has dims
>> jules[(numyears * nummonths),1,numpts,0]. Where the first dimension (132)
>> is
>> 12 months x 11 years. And as I said before I would like to average the
>> jan
>> from the first, second, third years etc. Then the same for the feb and so
>> on.
>>
>> So I don't see how you get to your 2d array that you mention in the first
>> line? I thought what you were suggesting was I could precompute the step
>> that builds the index for the months e.g
>>
>> mth_index = np.zeros(0)
>> for month in xrange(nummonths):
>>     mth_index = np.append(mth_index, np.arange(month, numyears *
>> nummonths,
>> nummonths))
>>
>> and use this as my index to skip the for loop. Though I still have a for
>> loop I guess!
>>
>>
>>
>>
>>
>>
>> Benjamin Root-2 wrote:
>> >
>> > Correction for me as well.  To mask out the negative values, use masked
>> > arrays.  So we will turn jules_2d into a masked array (second line),
>> then
>> > all subsequent commands will still work as expected.  It is very
>> similar
>> > to
>> > replacing negative values with nans and using nanmin().
>> >
>> >> jules_2d = jules.reshape((-1, 12))
>> >> jules_2d = np.ma.masked_array(jules_2d, mask=jules_2d < 0.0)
>> >> jules_monthly = np.mean(jules_2d, axis=0)
>> >> print jules_monthly.shape
>> >   (12,)
>> >
>> > Ben Root
>> >
>> > On Tue, Jun 8, 2010 at 7:49 PM, Benjamin Root <ben.root@ou.edu> wrote:
>> >
>> >> The np.mod in my example caused the data points to stay within [0, 11]
>> in
>> >> order to illustrate that these are months.  In my example, months are
>> >> column, years are rows.  In your desired output, months are rows and
>> >> years
>> >> are columns.  It makes very little difference which way you have it.
>> >>
>> >> Anyway, let's imagine that we have a time series of data "jules".  We
>> can
>> >> easily reshape this like so:
>> >>
>> >> > jules_2d = jules.reshape((-1, 12))
>> >> > jules_monthly = np.mean(jules_2d, axis=0)
>> >> > print jules_monthly.shape
>> >>   (12,)
>> >>
>> >> voila!  You have 12 values in jules_monthly which are means for that
>> >> month
>> >> across all years.
>> >>
>> >> protip - if you want yearly averages just change the ax parameter in
>> >> np.mean():
>> >> > jules_yearly = np.mean(jules_2d, axis=1)
>> >>
>> >> I hope that makes my previous explanation clearer.
>> >>
>> >> Ben Root
>> >>
>> >>
>> >> On Tue, Jun 8, 2010 at 5:41 PM, mdekauwe <mdekauwe@gmail.com> wrote:
>> >>
>> >>>
>> >>> OK...
>> >>>
>> >>> but if I do...
>> >>>
>> >>> In [28]: np.mod(np.arange(nummonths*numyears),
>> nummonths).reshape((-1,
>> >>> nummonths))
>> >>> Out[28]:
>> >>> array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]])
>> >>>
>> >>> When really I would be after something like this I think?
>> >>>
>> >>> array([  0,  12,  24,  36,  48,  60,  72,  84,  96, 108, 120],
>> >>>        [  1,  13,  25,  37,  49,  61,  73,  85,  97, 109, 121],
>> >>>        [  2,  14,  26,  38,  50,  62,  74,  86,  98, 110, 122]
>> >>>        etc, etc
>> >>>
>> >>> i.e. so for each month jump across the years.
>> >>>
>> >>> Not quite sure of this example...this is what I currently have which
>> >>> does
>> >>> seem to work, though I guess not completely efficiently.
>> >>>
>> >>> for month in xrange(nummonths):
>> >>>         tmp = jules[xrange(0, numyears * nummonths,
>> nummonths),VAR,:,0]
>> >>>        tmp[tmp < 0.0] = np.nan
>> >>>        data[month,:] = np.mean(tmp, axis=0)
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> Benjamin Root-2 wrote:
>> >>> >
>> >>> > If you want an average for each month from your timeseries, then
>> the
>> >>> > sneaky
>> >>> > way would be to reshape your array so that the time dimension is
>> split
>> >>> > into
>> >>> > two (month, year) dimensions.
>> >>> >
>> >>> > For a 1-D array, this would be:
>> >>> >
>> >>> >> dataarray = numpy.mod(numpy.arange(36), 12)
>> >>> >> print dataarray
>> >>> > array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,  0,  1,  2,
>>  3,
>> >>>  4,
>> >>> >         5,  6,  7,  8,  9, 10, 11,  0,  1,  2,  3,  4,  5,  6,  7,
>>  8,
>> >>>  9,
>> >>> >        10, 11])
>> >>> >> datamatrix = dataarray.reshape((-1, 12))
>> >>> >> print datamatrix
>> >>> > array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>> >        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
>> >>> >        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]])
>> >>> >
>> >>> > Hope that helps.
>> >>> >
>> >>> > Ben Root
>> >>> >
>> >>> >
>> >>> > On Fri, May 28, 2010 at 3:28 PM, mdekauwe <mdekauwe@gmail.com>
>> wrote:
>> >>> >
>> >>> >>
>> >>> >> OK so I just need to have a quick loop across the 12 months then,
>> >>> that
>> >>> is
>> >>> >> fine, just thought there might have been a sneaky way!
>> >>> >>
>> >>> >> Really appreciated, getting there slowly!
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> josef.pktd wrote:
>> >>> >> >
>> >>> >> > On Fri, May 28, 2010 at 4:14 PM, mdekauwe <mdekauwe@gmail.com>
>> >>> wrote:
>> >>> >> >>
>> >>> >> >> ok - something like this then...but how would i get the index
>> for
>> >>> the
>> >>> >> >> month
>> >>> >> >> for the data array (where month is 0, 1, 2, 4 ... 11)?
>> >>> >> >>
>> >>> >> >> data[month,:] = array[xrange(0, numyears * nummonths,
>> >>> >> nummonths),VAR,:,0]
>> >>> >> >
>> >>> >> > you would still need to start at the right month
>> >>> >> > data[month,:] = array[xrange(month, numyears * nummonths,
>> >>> >> > nummonths),VAR,:,0]
>> >>> >> > or
>> >>> >> > data[month,:] = array[month: numyears * nummonths :
>> >>> nummonths),VAR,:,0]
>> >>> >> >
>> >>> >> > an alternative would be a reshape with an extra month dimension
>> and
>> >>> >> > then sum only once over the year axis. this might be faster but
>> >>> >> > trickier to get the correct reshape .
>> >>> >> >
>> >>> >> > Josef
>> >>> >> >
>> >>> >> >>
>> >>> >> >> and would that be quicker than making an array months...
>> >>> >> >>
>> >>> >> >> months = np.arange(numyears * nummonths)
>> >>> >> >>
>> >>> >> >> and you that instead like you suggested x[start:end:12,:]?
>> >>> >> >>
>> >>> >> >> Many thanks again...
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> josef.pktd wrote:
>> >>> >> >>>
>> >>> >> >>> On Fri, May 28, 2010 at 3:53 PM, mdekauwe <mdekauwe@gmail.com>
>> >>> wrote:
>> >>> >> >>>>
>> >>> >> >>>> Ok thanks...I'll take a look.
>> >>> >> >>>>
>> >>> >> >>>> Back to my loops issue. What if instead this time I wanted to
>> >>> take
>> >>> >> an
>> >>> >> >>>> average so every march in 11 years, is there a quicker way to
>> go
>> >>> >> about
>> >>> >> >>>> doing
>> >>> >> >>>> that than my current method?
>> >>> >> >>>>
>> >>> >> >>>> nummonths = 12
>> >>> >> >>>> numyears = 11
>> >>> >> >>>>
>> >>> >> >>>> for month in xrange(nummonths):
>> >>> >> >>>>    for i in xrange(numpts):
>> >>> >> >>>>        for ym in xrange(month, numyears * nummonths,
>> nummonths):
>> >>> >> >>>>            data[month, i] += array[ym, VAR,
>> land_pts_index[i],
>> >>> 0]
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> x[start:end:12,:] gives you every 12th row of an array x
>> >>> >> >>>
>> >>> >> >>> something like this should work to get rid of the inner loop,
>> or
>> >>> you
>> >>> >> >>> could directly put
>> >>> >> >>> range(month, numyears * nummonths, nummonths) into the array
>> >>> instead
>> >>> >> >>> of ym and sum()
>> >>> >> >>>
>> >>> >> >>> Josef
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>>>
>> >>> >> >>>> so for each point in the array for a given month i am jumping
>> >>> >> through
>> >>> >> >>>> and
>> >>> >> >>>> getting the next years month and so on, summing it.
>> >>> >> >>>>
>> >>> >> >>>> Thanks...
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>> josef.pktd wrote:
>> >>> >> >>>>>
>> >>> >> >>>>> On Wed, May 26, 2010 at 5:03 PM, mdekauwe
>> <mdekauwe@gmail.com
>> >
>> >>> >> wrote:
>> >>> >> >>>>>>
>> >>> >> >>>>>> Could you possibly if you have time explain further your
>> >>> comment
>> >>> >> re
>> >>> >> >>>>>> the
>> >>> >> >>>>>> p-values, your suggesting I am misusing them?
>> >>> >> >>>>>
>> >>> >> >>>>> Depends on your use and interpretation
>> >>> >> >>>>>
>> >>> >> >>>>> test statistics, p-values are random variables, if you look
>> at
>> >>> >> several
>> >>> >> >>>>> tests at the same time, some p-values will be large just by
>> >>> chance.
>> >>> >> >>>>> If, for example you just look at the largest test statistic,
>> >>> then
>> >>> >> the
>> >>> >> >>>>> distribution for the max of several test statistics is not
>> the
>> >>> same
>> >>> >> as
>> >>> >> >>>>> the distribution for a single test statistic
>> >>> >> >>>>>
>> >>> >> >>>>> http://en.wikipedia.org/wiki/Multiple_comparisons
>> >>> >> >>>>>
>> http://www.itl.nist.gov/div898/handbook/prc/section4/prc47.htm
>> >>> >> >>>>>
>> >>> >> >>>>> we also just had a related discussion for ANOVA post-hoc
>> tests
>> >>> on
>> >>> >> the
>> >>> >> >>>>> pystatsmodels group.
>> >>> >> >>>>>
>> >>> >> >>>>> Josef
>> >>> >> >>>>>>
>> >>> >> >>>>>> Thanks.
>> >>> >> >>>>>>
>> >>> >> >>>>>>
>> >>> >> >>>>>> josef.pktd wrote:
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> On Sat, May 22, 2010 at 6:21 AM, mdekauwe
>> >>> <mdekauwe@gmail.com>
>> >>> >> >>>>>>> wrote:
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>> Sounds like I am stuck with the loop as I need to do the
>> >>> >> comparison
>> >>> >> >>>>>>>> for
>> >>> >> >>>>>>>> each
>> >>> >> >>>>>>>> pixel of the world and then I have a basemap function
>> call
>> >>> which
>> >>> >> I
>> >>> >> >>>>>>>> guess
>> >>> >> >>>>>>>> slows it down further...hmm
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> I don't see much that could be done differently, after a
>> >>> brief
>> >>> >> look.
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> stats.pearsonr could be replaced by an array version using
>> >>> >> directly
>> >>> >> >>>>>>> the formula for correlation even with nans. wilcoxon looks
>> >>> slow,
>> >>> >> and
>> >>> >> >>>>>>> I
>> >>> >> >>>>>>> never tried or seen a faster version.
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> just a reminder, the p-values are for a single test, when
>> you
>> >>> >> have
>> >>> >> >>>>>>> many of them, then they don't have the right
>> size/confidence
>> >>> >> level
>> >>> >> >>>>>>> for
>> >>> >> >>>>>>> an overall or joint test. (some packages report a
>> Bonferroni
>> >>> >> >>>>>>> correction in this case)
>> >>> >> >>>>>>>
>> >>> >> >>>>>>> Josef
>> >>> >> >>>>>>>
>> >>> >> >>>>>>>
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>> i.e.
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>> def compareSnowData(jules_var):
>> >>> >> >>>>>>>>    # Extract the 11 years of snow data and return
>> >>> >> >>>>>>>>    outrows = 180
>> >>> >> >>>>>>>>    outcols = 360
>> >>> >> >>>>>>>>    numyears = 11
>> >>> >> >>>>>>>>    nummonths = 12
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    # Read various files
>> >>> >> >>>>>>>>    fname="world_valid_jules_pts.ascii"
>> >>> >> >>>>>>>>    (numpts, land_pts_index, latitude, longitude, rows,
>> cols)
>> >>> =
>> >>> >> >>>>>>>> jo.read_land_points_ascii(fname, 1.0)
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    fname = "globalSnowRun_1985_96.GSWP2.nsmax0.mon.gra"
>> >>> >> >>>>>>>>    jules_data1 = jo.readJulesOutBinary(fname,
>> numrows=15238,
>> >>> >> >>>>>>>> numcols=1,
>> >>> >> >>>>>>>> \
>> >>> >> >>>>>>>>                       timesteps=132, numvars=26)
>> >>> >> >>>>>>>>    fname = "globalSnowRun_1985_96.GSWP2.nsmax3.mon.gra"
>> >>> >> >>>>>>>>    jules_data2 = jo.readJulesOutBinary(fname,
>> numrows=15238,
>> >>> >> >>>>>>>> numcols=1,
>> >>> >> >>>>>>>> \
>> >>> >> >>>>>>>>                       timesteps=132, numvars=26)
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    # grab some space
>> >>> >> >>>>>>>>    data1_snow = np.zeros((nummonths * numyears, numpts),
>> >>> >> >>>>>>>> dtype=np.float32)
>> >>> >> >>>>>>>>    data2_snow = np.zeros((nummonths * numyears, numpts),
>> >>> >> >>>>>>>> dtype=np.float32)
>> >>> >> >>>>>>>>    pearsonsr_snow = np.ones((outrows, outcols),
>> >>> >> dtype=np.float32)
>> >>> >> *
>> >>> >> >>>>>>>> np.nan
>> >>> >> >>>>>>>>    wilcoxStats_snow = np.ones((outrows, outcols),
>> >>> >> dtype=np.float32)
>> >>> >> >>>>>>>> *
>> >>> >> >>>>>>>> np.nan
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    # extract the data
>> >>> >> >>>>>>>>    data1_snow = jules_data1[:,jules_var,:,0]
>> >>> >> >>>>>>>>    data2_snow = jules_data2[:,jules_var,:,0]
>> >>> >> >>>>>>>>    data1_snow = np.where(data1_snow < 0.0, np.nan,
>> >>> data1_snow)
>> >>> >> >>>>>>>>    data2_snow = np.where(data2_snow < 0.0, np.nan,
>> >>> data2_snow)
>> >>> >> >>>>>>>>    #for month in xrange(numyears * nummonths):
>> >>> >> >>>>>>>>    #    for i in xrange(numpts):
>> >>> >> >>>>>>>>    #        data1 =
>> >>> >> >>>>>>>> jules_data1[month,jules_var,land_pts_index[i],0]
>> >>> >> >>>>>>>>    #        data2 =
>> >>> >> >>>>>>>> jules_data2[month,jules_var,land_pts_index[i],0]
>> >>> >> >>>>>>>>    #        if data1 >= 0.0:
>> >>> >> >>>>>>>>    #            data1_snow[month,i] = data1
>> >>> >> >>>>>>>>    #        else:
>> >>> >> >>>>>>>>    #            data1_snow[month,i] = np.nan
>> >>> >> >>>>>>>>    #        if data2 > 0.0:
>> >>> >> >>>>>>>>    #            data2_snow[month,i] = data2
>> >>> >> >>>>>>>>    #        else:
>> >>> >> >>>>>>>>    #            data2_snow[month,i] = np.nan
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    # exclude any months from *both* arrays where we have
>> >>> dodgy
>> >>> >> >>>>>>>> data,
>> >>> >> >>>>>>>> else
>> >>> >> >>>>>>>> we
>> >>> >> >>>>>>>>    # can't do the correlations correctly!!
>> >>> >> >>>>>>>>    data1_snow = np.where(np.isnan(data2_snow), np.nan,
>> >>> >> data1_snow)
>> >>> >> >>>>>>>>    data2_snow = np.where(np.isnan(data1_snow), np.nan,
>> >>> >> data1_snow)
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    # put data on a regular grid...
>> >>> >> >>>>>>>>    print 'regridding landpts...'
>> >>> >> >>>>>>>>    for i in xrange(numpts):
>> >>> >> >>>>>>>>        # exclude the NaN, note masking them doesn't work
>> in
>> >>> the
>> >>> >> >>>>>>>> stats
>> >>> >> >>>>>>>> func
>> >>> >> >>>>>>>>        x = data1_snow[:,i]
>> >>> >> >>>>>>>>        x = x[np.isfinite(x)]
>> >>> >> >>>>>>>>        y = data2_snow[:,i]
>> >>> >> >>>>>>>>        y = y[np.isfinite(y)]
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>        # r^2
>> >>> >> >>>>>>>>        # exclude v.small arrays, i.e. we need just less
>> over
>> >>> 4
>> >>> >> >>>>>>>> years
>> >>> >> >>>>>>>> of
>> >>> >> >>>>>>>> data
>> >>> >> >>>>>>>>        if len(x) and len(y) > 50:
>> >>> >> >>>>>>>>           
>> pearsonsr_snow[((180-1)-(rows[i]-1)),cols[i]-1]
>> =
>> >>> >> >>>>>>>> (stats.pearsonr(x, y)[0])**2
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>        # wilcox signed rank test
>> >>> >> >>>>>>>>        # make sure we have enough samples to do the test
>> >>> >> >>>>>>>>        d = x - y
>> >>> >> >>>>>>>>        d = np.compress(np.not_equal(d,0), d ,axis=-1) #
>> Keep
>> >>> all
>> >>> >> >>>>>>>> non-zero
>> >>> >> >>>>>>>> differences
>> >>> >> >>>>>>>>        count = len(d)
>> >>> >> >>>>>>>>        if count > 10:
>> >>> >> >>>>>>>>            z, pval = stats.wilcoxon(x, y)
>> >>> >> >>>>>>>>            # only map out sign different data
>> >>> >> >>>>>>>>            if pval < 0.05:
>> >>> >> >>>>>>>>
>> >>>  wilcoxStats_snow[((180-1)-(rows[i]-1)),cols[i]-1]
>> >>> >> =
>> >>> >> >>>>>>>> np.mean(x - y)
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>    return (pearsonsr_snow, wilcoxStats_snow)
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>> josef.pktd wrote:
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> On Fri, May 21, 2010 at 10:14 PM, mdekauwe <
>> >>> mdekauwe@gmail.com>
>> >>> >> >>>>>>>>> wrote:
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> Also I then need to remap the 2D array I make onto
>> another
>> >>> >> grid
>> >>> >> >>>>>>>>>> (the
>> >>> >> >>>>>>>>>> world in
>> >>> >> >>>>>>>>>> this case). Which again I had am doing with a loop
>> (note
>> >>> >> numpts
>> >>> >> >>>>>>>>>> is
>> >>> >> >>>>>>>>>> a
>> >>> >> >>>>>>>>>> lot
>> >>> >> >>>>>>>>>> bigger than my example above).
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> wilcoxStats_snow = np.ones((outrows, outcols),
>> >>> >> dtype=np.float32)
>> >>> >> >>>>>>>>>> *
>> >>> >> >>>>>>>>>> np.nan
>> >>> >> >>>>>>>>>> for i in xrange(numpts):
>> >>> >> >>>>>>>>>>        # exclude the NaN, note masking them doesn't
>> work
>> >>> in
>> >>> >> the
>> >>> >> >>>>>>>>>> stats
>> >>> >> >>>>>>>>>> func
>> >>> >> >>>>>>>>>>        x = data1_snow[:,i]
>> >>> >> >>>>>>>>>>        x = x[np.isfinite(x)]
>> >>> >> >>>>>>>>>>        y = data2_snow[:,i]
>> >>> >> >>>>>>>>>>        y = y[np.isfinite(y)]
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>>        # wilcox signed rank test
>> >>> >> >>>>>>>>>>        # make sure we have enough samples to do the
>> test
>> >>> >> >>>>>>>>>>        d = x - y
>> >>> >> >>>>>>>>>>        d = np.compress(np.not_equal(d,0), d ,axis=-1) #
>> >>> Keep
>> >>> >> all
>> >>> >> >>>>>>>>>> non-zero
>> >>> >> >>>>>>>>>> differences
>> >>> >> >>>>>>>>>>        count = len(d)
>> >>> >> >>>>>>>>>>        if count > 10:
>> >>> >> >>>>>>>>>>            z, pval = stats.wilcoxon(x, y)
>> >>> >> >>>>>>>>>>            # only map out sign different data
>> >>> >> >>>>>>>>>>            if pval < 0.05:
>> >>> >> >>>>>>>>>>
>> >>> >> wilcoxStats_snow[((180-1)-(rows[i]-1)),cols[i]-1]
>> >>> >> >>>>>>>>>> =
>> >>> >> >>>>>>>>>> np.mean(x - y)
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> Now I think I can push the data in one move into the
>> >>> >> >>>>>>>>>> wilcoxStats_snow
>> >>> >> >>>>>>>>>> array
>> >>> >> >>>>>>>>>> by removing the index,
>> >>> >> >>>>>>>>>> but I can't see how I will get the individual x and y
>> pts
>> >>> for
>> >>> >> >>>>>>>>>> each
>> >>> >> >>>>>>>>>> array
>> >>> >> >>>>>>>>>> member correctly without the loop, this was my attempt
>> >>> which
>> >>> >> of
>> >>> >> >>>>>>>>>> course
>> >>> >> >>>>>>>>>> doesn't work!
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> x = data1_snow[:,:]
>> >>> >> >>>>>>>>>> x = x[np.isfinite(x)]
>> >>> >> >>>>>>>>>> y = data2_snow[:,:]
>> >>> >> >>>>>>>>>> y = y[np.isfinite(y)]
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> # r^2
>> >>> >> >>>>>>>>>> # exclude v.small arrays, i.e. we need just less over 4
>> >>> years
>> >>> >> of
>> >>> >> >>>>>>>>>> data
>> >>> >> >>>>>>>>>> if len(x) and len(y) > 50:
>> >>> >> >>>>>>>>>>    pearsonsr_snow[((180-1)-(rows-1)),cols-1] =
>> >>> >> (stats.pearsonr(x,
>> >>> >> >>>>>>>>>> y)[0])**2
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> If you want to do pairwise comparisons with
>> stats.wilcoxon,
>> >>> >> then
>> >>> >> >>>>>>>>> you
>> >>> >> >>>>>>>>> might be stuck with the loop, since wilcoxon takes only
>> two
>> >>> 1d
>> >>> >> >>>>>>>>> arrays
>> >>> >> >>>>>>>>> at a time (if I read the help correctly).
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> Also the presence of nans might force the use a loop.
>> >>> >> stats.mstats
>> >>> >> >>>>>>>>> has
>> >>> >> >>>>>>>>> masked array versions, but I didn't see wilcoxon in the
>> >>> list.
>> >>> >> >>>>>>>>> (Even
>> >>> >> >>>>>>>>> when vectorized operations would work with regular
>> arrays,
>> >>> nan
>> >>> >> or
>> >>> >> >>>>>>>>> masked array versions still have to loop in many cases.)
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> If you have many columns with count <= 10, so that
>> wilcoxon
>> >>> is
>> >>> >> not
>> >>> >> >>>>>>>>> calculated then it might be worth to use only array
>> >>> operations
>> >>> >> up
>> >>> >> >>>>>>>>> to
>> >>> >> >>>>>>>>> that point. If wilcoxon is calculated most of the time,
>> >>> then
>> >>> >> it's
>> >>> >> >>>>>>>>> not
>> >>> >> >>>>>>>>> worth thinking too hard about this.
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> Josef
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> thanks.
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> mdekauwe wrote:
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>> Yes as Zachary said index is only 0 to 15237, so both
>> >>> methods
>> >>> >> >>>>>>>>>>> work.
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>> I don't quite get what you mean about slicing with
>> axis
>> >
>> >>> 3.
>> >>> >> Is
>> >>> >> >>>>>>>>>>> there
>> >>> >> >>>>>>>>>>> a
>> >>> >> >>>>>>>>>>> link you can recommend I should read? Does that mean
>> >>> given
>> >>> I
>> >>> >> >>>>>>>>>>> have
>> >>> >> >>>>>>>>>>> 4dims
>> >>> >> >>>>>>>>>>> that Josef's suggestion would be more advised in this
>> >>> case?
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> There were several discussions on the mailing lists
>> (fancy
>> >>> >> slicing
>> >>> >> >>>>>>>>> and
>> >>> >> >>>>>>>>> indexing). Your case is safe, but if you run in future
>> into
>> >>> >> funny
>> >>> >> >>>>>>>>> shapes, you can look up the details.
>> >>> >> >>>>>>>>> when in doubt, I use np.arange(...)
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>> Josef
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>> Thanks.
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>> josef.pktd wrote:
>> >>> >> >>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>> On Fri, May 21, 2010 at 10:55 AM, mdekauwe <
>> >>> >> mdekauwe@gmail.com>
>> >>> >> >>>>>>>>>>>> wrote:
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> Thanks that works...
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> So the way to do it is with
>> np.arange(tsteps)[:,None],
>> >>> that
>> >>> >> >>>>>>>>>>>>> was
>> >>> >> >>>>>>>>>>>>> the
>> >>> >> >>>>>>>>>>>>> step
>> >>> >> >>>>>>>>>>>>> I
>> >>> >> >>>>>>>>>>>>> was struggling with, so this forms a 2D array which
>> >>> >> replaces
>> >>> >> >>>>>>>>>>>>> the
>> >>> >> >>>>>>>>>>>>> the
>> >>> >> >>>>>>>>>>>>> two
>> >>> >> >>>>>>>>>>>>> for
>> >>> >> >>>>>>>>>>>>> loops? Do I have that right?
>> >>> >> >>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>> Yes, but as Zachary showed, if you need the full
>> index
>> >>> in
>> >>> a
>> >>> >> >>>>>>>>>>>> dimension,
>> >>> >> >>>>>>>>>>>> then you can use slicing. It might be faster.
>> >>> >> >>>>>>>>>>>> And a warning, mixing slices and index arrays with 3
>> or
>> >>> more
>> >>> >> >>>>>>>>>>>> dimensions can have some surprise switching of axes.
>> >>> >> >>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>> Josef
>> >>> >> >>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> A lot quicker...!
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> Martin
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> josef.pktd wrote:
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>> On Fri, May 21, 2010 at 8:59 AM, mdekauwe
>> >>> >> >>>>>>>>>>>>>> <mdekauwe@gmail.com>
>> >>> >> >>>>>>>>>>>>>> wrote:
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> Hi,
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> I am trying to extract data from a 4D array and
>> store
>> >>> it
>> >>> >> in
>> >>> >> >>>>>>>>>>>>>>> a
>> >>> >> >>>>>>>>>>>>>>> 2D
>> >>> >> >>>>>>>>>>>>>>> array,
>> >>> >> >>>>>>>>>>>>>>> but
>> >>> >> >>>>>>>>>>>>>>> avoid my current usage of the for loops for speed,
>> as
>> >>> in
>> >>> >> >>>>>>>>>>>>>>> reality
>> >>> >> >>>>>>>>>>>>>>> the
>> >>> >> >>>>>>>>>>>>>>> arrays
>> >>> >> >>>>>>>>>>>>>>> sizes are quite big. Could someone also try and
>> >>> explain
>> >>> >> the
>> >>> >> >>>>>>>>>>>>>>> solution
>> >>> >> >>>>>>>>>>>>>>> as
>> >>> >> >>>>>>>>>>>>>>> well
>> >>> >> >>>>>>>>>>>>>>> if they have a spare moment as I am still finding
>> it
>> >>> >> quite
>> >>> >> >>>>>>>>>>>>>>> difficult
>> >>> >> >>>>>>>>>>>>>>> to
>> >>> >> >>>>>>>>>>>>>>> get
>> >>> >> >>>>>>>>>>>>>>> over the habit of using loops (C convert for my
>> >>> sins).
>> >>> I
>> >>> >> get
>> >>> >> >>>>>>>>>>>>>>> that
>> >>> >> >>>>>>>>>>>>>>> one
>> >>> >> >>>>>>>>>>>>>>> could
>> >>> >> >>>>>>>>>>>>>>> precompute the indices's i and j i.e.
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> i = np.arange(tsteps)
>> >>> >> >>>>>>>>>>>>>>> j = np.arange(numpts)
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> but just can't get my head round how i then use
>> >>> them...
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> Thanks,
>> >>> >> >>>>>>>>>>>>>>> Martin
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> import numpy as np
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> numpts=10
>> >>> >> >>>>>>>>>>>>>>> tsteps = 12
>> >>> >> >>>>>>>>>>>>>>> vari = 22
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> data = np.random.random((tsteps, vari, numpts, 1))
>> >>> >> >>>>>>>>>>>>>>> new_data = np.zeros((tsteps, numpts),
>> >>> dtype=np.float32)
>> >>> >> >>>>>>>>>>>>>>> index = np.arange(numpts)
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> for i in xrange(tsteps):
>> >>> >> >>>>>>>>>>>>>>>    for j in xrange(numpts):
>> >>> >> >>>>>>>>>>>>>>>        new_data[i,j] = data[i,5,index[j],0]
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>> The index arrays need to be broadcastable against
>> each
>> >>> >> other.
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>> I think this should do it
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>> new_data = data[np.arange(tsteps)[:,None], 5,
>> >>> >> >>>>>>>>>>>>>> np.arange(numpts),
>> >>> >> >>>>>>>>>>>>>> 0]
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>> Josef
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> --
>> >>> >> >>>>>>>>>>>>>>> View this message in context:
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >>
>> http://old.nabble.com/removing-for-loops...-tp28633477p28633477.html
>> >>> >> >>>>>>>>>>>>>>> Sent from the Scipy-User mailing list archive at
>> >>> >> Nabble.com.
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>> _______________________________________________
>> >>> >> >>>>>>>>>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>>>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>> _______________________________________________
>> >>> >> >>>>>>>>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> --
>> >>> >> >>>>>>>>>>>>> View this message in context:
>> >>> >> >>>>>>>>>>>>>
>> >>> >>
>> http://old.nabble.com/removing-for-loops...-tp28633477p28634924.html
>> >>> >> >>>>>>>>>>>>> Sent from the Scipy-User mailing list archive at
>> >>> >> Nabble.com.
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>> _______________________________________________
>> >>> >> >>>>>>>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>> _______________________________________________
>> >>> >> >>>>>>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>>
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>>
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> --
>> >>> >> >>>>>>>>>> View this message in context:
>> >>> >> >>>>>>>>>>
>> >>> >>
>> http://old.nabble.com/removing-for-loops...-tp28633477p28640656.html
>> >>> >> >>>>>>>>>> Sent from the Scipy-User mailing list archive at
>> >>> Nabble.com.
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>>> _______________________________________________
>> >>> >> >>>>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>>>
>> >>> >> >>>>>>>>> _______________________________________________
>> >>> >> >>>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>>
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>> --
>> >>> >> >>>>>>>> View this message in context:
>> >>> >> >>>>>>>>
>> >>> >>
>> http://old.nabble.com/removing-for-loops...-tp28633477p28642434.html
>> >>> >> >>>>>>>> Sent from the Scipy-User mailing list archive at
>> Nabble.com.
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>>> _______________________________________________
>> >>> >> >>>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>>
>> >>> >> >>>>>>> _______________________________________________
>> >>> >> >>>>>>> SciPy-User mailing list
>> >>> >> >>>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>>
>> >>> >> >>>>>>>
>> >>> >> >>>>>>
>> >>> >> >>>>>> --
>> >>> >> >>>>>> View this message in context:
>> >>> >> >>>>>>
>> >>> >>
>> http://old.nabble.com/removing-for-loops...-tp28633477p28686356.html
>> >>> >> >>>>>> Sent from the Scipy-User mailing list archive at
>> Nabble.com.
>> >>> >> >>>>>>
>> >>> >> >>>>>> _______________________________________________
>> >>> >> >>>>>> SciPy-User mailing list
>> >>> >> >>>>>> SciPy-User@scipy.org
>> >>> >> >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>>
>> >>> >> >>>>> _______________________________________________
>> >>> >> >>>>> SciPy-User mailing list
>> >>> >> >>>>> SciPy-User@scipy.org
>> >>> >> >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>>
>> >>> >> >>>>>
>> >>> >> >>>>
>> >>> >> >>>> --
>> >>> >> >>>> View this message in context:
>> >>> >> >>>>
>> >>> http://old.nabble.com/removing-for-loops...-tp28633477p28711249.html
>> >>> >> >>>> Sent from the Scipy-User mailing list archive at Nabble.com.
>> >>> >> >>>>
>> >>> >> >>>> _______________________________________________
>> >>> >> >>>> SciPy-User mailing list
>> >>> >> >>>> SciPy-User@scipy.org
>> >>> >> >>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>>
>> >>> >> >>> _______________________________________________
>> >>> >> >>> SciPy-User mailing list
>> >>> >> >>> SciPy-User@scipy.org
>> >>> >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>
>> >>> >> >> --
>> >>> >> >> View this message in context:
>> >>> >> >>
>> >>> http://old.nabble.com/removing-for-loops...-tp28633477p28711444.html
>> >>> >> >> Sent from the Scipy-User mailing list archive at Nabble.com.
>> >>> >> >>
>> >>> >> >> _______________________________________________
>> >>> >> >> SciPy-User mailing list
>> >>> >> >> SciPy-User@scipy.org
>> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >>
>> >>> >> > _______________________________________________
>> >>> >> > SciPy-User mailing list
>> >>> >> > SciPy-User@scipy.org
>> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >> >
>> >>> >> >
>> >>> >>
>> >>> >> --
>> >>> >> View this message in context:
>> >>> >>
>> http://old.nabble.com/removing-for-loops...-tp28633477p28711581.html
>> >>> >> Sent from the Scipy-User mailing list archive at Nabble.com.
>> >>> >>
>> >>> >> _______________________________________________
>> >>> >> SciPy-User mailing list
>> >>> >> SciPy-User@scipy.org
>> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >>
>> >>> >
>> >>> > _______________________________________________
>> >>> > SciPy-User mailing list
>> >>> > SciPy-User@scipy.org
>> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>> >
>> >>> >
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>> http://old.nabble.com/removing-for-loops...-tp28633477p28824023.html
>> >>> Sent from the Scipy-User mailing list archive at Nabble.com.
>> >>>
>> >>> _______________________________________________
>> >>> SciPy-User mailing list
>> >>> SciPy-User@scipy.org
>> >>> http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>>
>> >>
>> >>
>> >
>> > _______________________________________________
>> > SciPy-User mailing list
>> > SciPy-User@scipy.org
>> > http://mail.scipy.org/mailman/listinfo/scipy-user
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/removing-for-loops...-tp28633477p28846602.html
>> Sent from the Scipy-User mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User@scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
> 
> _______________________________________________
> SciPy-User mailing list
> SciPy-User@scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
> 
> 

-- 
View this message in context: http://old.nabble.com/removing-for-loops...-tp28633477p28848191.html
Sent from the Scipy-User mailing list archive at Nabble.com.



More information about the SciPy-User mailing list