[SciPy-User] More efficient way of sorting and filtering structured array.
Dharhas Pothina
Dharhas.Pothina@twdb.state.tx...
Wed Aug 4 12:26:35 CDT 2010
Just to add,
one idea I had was to use argsort to sort the array by FILENAME, LINENAME and pdist. This would result in an array that has blocks with common FILENAME and LINENAME, each block ordered by increasing pdist. I would then have to select the first line of each block, and I'm not sure there is an easy way to do that.
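The sort-then-take-first idea can be sketched roughly as below. This is only an illustration under assumptions: the reduced dtype `demo_dtype`, the sample rows and the function name `first_of_each_block` are made up for the example and are not the real pnt_dtype or data. The first row of each block is found by comparing every sorted row with the row before it.

```python
import numpy as np

# Hypothetical reduced dtype: only the three fields the idea needs.
demo_dtype = np.dtype([('FILENAME', 'S50'), ('LINENAME', 'S50'), ('pdist', 'f8')])

def first_of_each_block(points):
    """Return the row with the smallest pdist for each (FILENAME, LINENAME) pair."""
    # Sort by FILENAME, then LINENAME, then pdist, so each block starts
    # with its closest point.
    order = np.argsort(points, order=['FILENAME', 'LINENAME', 'pdist'])
    s = points[order]
    if s.size == 0:
        return s
    # A row starts a new block when either key differs from the previous row.
    is_first = np.ones(s.size, dtype=bool)
    is_first[1:] = ((s['FILENAME'][1:] != s['FILENAME'][:-1]) |
                    (s['LINENAME'][1:] != s['LINENAME'][:-1]))
    return s[is_first]

# Made-up sample data for illustration only.
demo = np.array([(b'f1', b'l1', 3.0),
                 (b'f1', b'l1', 1.0),
                 (b'f2', b'l1', 2.0),
                 (b'f1', b'l2', 5.0),
                 (b'f2', b'l1', 0.5)], dtype=demo_dtype)
result = first_of_each_block(demo)
```

The result comes back ordered by FILENAME and LINENAME rather than in the original row order, which may or may not matter for later processing.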
- dharhas
>>> "Dharhas Pothina" <Dharhas.Pothina@twdb.state.tx.us> 8/4/2010 11:52 AM >>>
Hi,
I have a structured array that contains intersection points between two sets of lines (specified by LINENAME and FILENAME). For each unique combination of LINENAME and FILENAME, there are a number of matches in the array intersection_points and I need only the closest match (i.e. the one with the smallest value in the pdist field). The following snippet works but is extremely slow. I was wondering if there is a more efficient way to do this.
import numpy as np

# start with an empty closest_points array
closest_points = np.zeros(0, dtype=pnt_dtype)
# loop through unique linenames of intersection_points
for linename in np.unique(intersection_points['LINENAME']):
    # loop through unique filenames of intersection_points
    for filename in np.unique(intersection_points['FILENAME']):
        # create separate boolean masks from intersection_points matching linename THEN filename
        idx_line = intersection_points['LINENAME'] == linename
        idx_file = intersection_points['FILENAME'] == filename
        # create temporary array from only points of correct linename AND filename
        tmp_points = intersection_points[idx_line & idx_file]
        # avoid empty array errors by making sure something is present
        if tmp_points.size > 0:
            # sort tmp_points array by pdist
            idx_sort = np.argsort(tmp_points, order='pdist')
            # append closest tmp_point to the end of the closest_points array
            closest_points = np.hstack((closest_points, tmp_points[idx_sort][0]))
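For comparison, one alternative to the nested loops is a single pass over the rows that remembers, for each (LINENAME, FILENAME) pair, the row with the smallest pdist seen so far. The sketch below is not a tested replacement for the snippet above. It uses a hypothetical reduced dtype `demo_dtype`, made-up sample rows and an illustrative function name `closest_per_pair`, and it assumes that keeping the first row on ties is acceptable.

```python
import numpy as np

# Hypothetical reduced dtype: only the three fields involved in the filtering.
demo_dtype = np.dtype([('FILENAME', 'S50'), ('LINENAME', 'S50'), ('pdist', 'f8')])

def closest_per_pair(points):
    """Keep, for each (LINENAME, FILENAME) pair, the row with the smallest pdist."""
    best = {}  # (linename, filename) -> (smallest pdist so far, row index)
    rows = zip(points['LINENAME'], points['FILENAME'], points['pdist'])
    for i, (linename, filename, pdist) in enumerate(rows):
        key = (linename, filename)
        # Strict '<' keeps the earliest row when two rows tie on pdist.
        if key not in best or pdist < best[key][0]:
            best[key] = (pdist, i)
    # Select the winning rows, preserving their original order.
    keep = sorted(i for _, i in best.values())
    return points[keep]

# Made-up sample data for illustration only.
demo = np.array([(b'f1', b'l1', 3.0),
                 (b'f1', b'l1', 1.0),
                 (b'f2', b'l1', 2.0),
                 (b'f1', b'l2', 5.0),
                 (b'f2', b'l1', 0.5)], dtype=demo_dtype)
closest_demo = closest_per_pair(demo)
```

This visits each row once instead of rescanning the whole array for every linename/filename combination, and it never builds the empty combinations that the `tmp_points.size > 0` check guards against.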
For reference, pnt_dtype is:
pnt_dtype = np.dtype([('lon', 'f8'), ('lat', 'f8'), ('x', 'f8'), ('y', 'f8'),
                      ('FILENAME', 'S50'),
                      ('FILE_NUM', 'i4'),
                      ('SSID', 'i8'),
                      ('Lidx', 'i4'),
                      ('pdist', 'f8'),
                      ('LINENAME', 'S50'),
                      ('HE_Code', 'i4'),
                      ('PairNum', 'i4'),
                      ('dist', 'S50')
                      ])
thanks,
- dharhas
