[Numpy-discussion] scan array to extract min-max values (with if condition)

Massimo Di Stefano massimodisasha@gmail....
Sat Sep 11 14:53:27 CDT 2010


Brett,


I tried a different way to solve the problem, using:

#############
import os

fpath = '/Users/sasha/py/'
# read all lines, closing the file when done
with open( os.path.join(fpath, 'BE3730072600WC20050817.txt'), 'r' ) as input_fp:
    input_file = input_fp.readlines()

N = 234560.94503118 
S = 234482.56929822 
E = 921336.53116178 
W = 921185.3779625

xL = []
yL = []
zL = []

for index, line in enumerate( input_file ):
    if index == 0:
        print 'skipping header line...'
    else:
        x, y, z = line.split(',')
        xL.append( float(x) * 0.3048006096012 )
        yL.append( float(y) * 0.3048006096012 )
        zL.append( float(z) * 0.3048006096012 )

xLr = []
yLr = []
zLr = []

for coords in zip(xL, yL, zL):  
    if W < coords[0] < E and S < coords[1] < N:
        xLr.append( coords[0] )
        yLr.append( coords[1] )
        zLr.append( coords[2] )

elements = len(xLr)
minZ = min(zLr)
maxZ = max(zLr)

############

Using the same input file I posted earlier,
it gives me 966 elements
instead of the 158734 elements given by your "MASK" example.

The input file contains 158734 elements, which means the mask code:


> 	mask |= mydata[:,0] < E
> 	mask |= mydata[:,0] > W
> 	mask |= mydata[:,1] < N
> 	mask |= mydata[:,1] > S

is not working as expected.
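
A possible reason, sketched on made-up points (the array `pts` below is
hypothetical, not the posted file): because W < E, every x value is either
below E or above W, so OR-ing the four comparisons into a mask that starts
all False should end up all True, whatever the point.

```python
import numpy as np

# Bounding box from the message above
N, S = 234560.94503118, 234482.56929822
E, W = 921336.53116178, 921185.3779625

# Hypothetical sample points (x, y, z), assumed already scaled:
# one inside the box, two outside it
pts = np.array([
    [921200.0, 234500.0, 1.0],   # inside the box
    [921000.0, 234500.0, 2.0],   # west of the box
    [921200.0, 235000.0, 3.0],   # north of the box
])

# Same accumulation as the quoted mask code
mask = np.zeros(pts.shape[0], dtype=bool)
mask |= pts[:, 0] < E
mask |= pts[:, 0] > W
mask |= pts[:, 1] < N
mask |= pts[:, 1] > S

# Number of rows the mask selects; here it is every row, inside or not
n_selected = int(mask.sum())
```

On this toy input all three rows are selected, which matches the symptom of
getting back the full 158734 points.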


Do you have any hints on how to get the "MASK" code working?
As it is now, it picks all the points in the "mydata" array.
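
One way this might be fixed, as a sketch rather than a tested patch: combine
the four comparisons with AND (`&`), so that a point is kept only if it
satisfies all of them. This mirrors the `W < x < E and S < y < N` test in the
loop above. The names `get_min_max_bb`, `SCALE`, and `sample` are
hypothetical; the scale factor is the one used in this thread, and the sample
points are invented for illustration.

```python
import numpy as np

# Scale factor used in the thread (it looks like US survey feet to metres,
# but that is an assumption)
SCALE = 0.3048006096012

def get_min_max_bb(data, N, S, E, W):
    """Return min/max Z and the point count strictly inside the box."""
    mydata = data * SCALE
    x, y, z = mydata[:, 0], mydata[:, 1], mydata[:, 2]
    # A point is inside only if ALL four conditions hold
    mask = (x > W) & (x < E) & (y > S) & (y < N)
    if not mask.any():
        # No point falls inside the box: nothing to take min/max of
        return {'Max_Z': None, 'Min_Z': None, 'Num_P': 0}
    return {'Max_Z': z[mask].max(),
            'Min_Z': z[mask].min(),
            'Num_P': int(mask.sum())}

N, S = 234560.94503118, 234482.56929822
E, W = 921336.53116178, 921185.3779625

# Invented points given in scaled units, then converted back to input units:
# two inside the box, one west of it
scaled = np.array([
    [921200.0, 234500.0, 1.0],
    [921300.0, 234550.0, 5.0],
    [921000.0, 234500.0, 9.0],
])
sample = scaled / SCALE

result = get_min_max_bb(sample, N, S, E, W)
```

With this toy input only the two inside points are counted. Whether it
reproduces the 966 points from the loop on the real file has not been checked
here.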


thanks!

Massimo.

On 11 Sep 2010, at 16:19, Brett Olsen wrote:

> On Sat, Sep 11, 2010 at 7:45 AM, Massimo Di Stefano
> <massimodisasha@gmail.com> wrote:
>> Hello All,
>> 
>> I need to extract the data from an array that lie inside a
>> rectangular area defined as:
>> 
>> N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
>> 
>> The data are in a CSV file (comma-delimited text, with 3 columns: X, Y, Z):
>> 
>> #X,Y,Z
>> 3020081.5500,769999.3100,0.0300
>> 3020086.2000,769991.6500,0.4600
>> 3020099.6600,769996.2700,0.9000
>> ...
>> ...
>> 
>> I read it using "numpy.loadtxt".
>> 
>> Data:
>> 
>> http://www.geofemengineering.it/data/csv.txt     5.3 MB (158735 rows)
>> 
>> To extract the data that are inside the bounding-box area (N, S, E, W), I'm using a loop
>> inside a function like:
>> 
>> import numpy as np
>> 
>> def getMinMaxBB(data, N, S, E, W):
>>        mydata = data * 0.3048006096012
>>        for i in range(len(mydata)):
>>                if mydata[i,0] < E or mydata[i,0] > W or mydata[i,1] < N or mydata[i,1] > S :
>>                        if i == 0:
>>                                newdata = np.array((mydata[i,0],mydata[i,1],mydata[i,2]), float)
>>                        else :
>>                                newdata = np.vstack((newdata,(mydata[i,0], mydata[i,1], mydata[i,2])))
>>        results = {}
>>        results['Max_Z'] = newdata.max(0)[2]
>>        results['Min_Z'] = newdata.min(0)[2]
>>        results['Num_P'] = len(newdata)
>>        return results
>> 
>> 
>> N, S, E, W = 234560.94503118, 234482.56929822, 921336.53116178, 921185.3779625
>> data = '/Users/sasha/csv.txt'
>> mydata = np.loadtxt(data, comments='#', delimiter=',')
>> out = getMinMaxBB(mydata, N, S, E, W)
>> 
>> print out
> 
> Use boolean arrays to index the parts of your array that you want to look at:
> 
> def newGetMinMax(data, N, S, E, W):
> 	mydata = data * 0.3048006096012
> 	mask = np.zeros(mydata.shape[0], dtype=bool)
> 	mask |= mydata[:,0] < E
> 	mask |= mydata[:,0] > W
> 	mask |= mydata[:,1] < N
> 	mask |= mydata[:,1] > S
> 	results = {}
> 	results['Max_Z'] = mydata[mask,2].max()
> 	results['Min_Z'] = mydata[mask,2].min()
> 	results['Num_P'] = mask.sum()
> 	return results
> 
> This runs about 5000 times faster on my machine.
> 
> Brett
