# [Numpy-discussion] Determine slices in a sorted array

Laszlo Nagy gandalf@shopzeus....
Thu Jul 1 15:13:50 CDT 2010

```
Given an array with two axes, sorted by a column 'SLICE_BY', how can I
extract slice indexes for rows with the same 'SLICE_BY' value?

Here is an example program, demonstrating the problem:

from numpy import *

a = random.randint(0,100,(20,4))
SLICE_BY = 0 # Make slices of array 'a' by column SLICE_BY

a.sort(SLICE_BY)
slices = []
prev_val = None
sidx = -1
for rowidx,row in enumerate(a):
    val = row[SLICE_BY]
    if val!=prev_val:
        if prev_val is None:
            prev_val = val
            sidx = rowidx
        else:
            slices.append((prev_val,sidx,rowidx))
            sidx = rowidx
            prev_val = val

if sidx<a.shape[0]-1:
    slices.append((val,sidx,a.shape[0]))

print a
print slices

This program would print:

[[ 1  0  8  1]
[ 4  5 17  9]
[ 4 11 19 23]
[11 12 24 23]
[13 16 28 23]
[14 26 29 36]
[15 33 32 37]
[20 38 38 40]
[28 47 47 45]
[33 50 50 57]
[45 55 52 65]
[47 67 60 65]
[56 76 71 68]
[61 76 71 78]
[70 83 82 83]
[89 83 84 85]
[91 84 85 87]
[95 96 86 88]
[98 96 89 88]
[99 98 92 88]]
[(1, 0, 1), (4, 1, 3), (11, 3, 4), (13, 4, 5), (14, 5, 6), (15, 6, 7),
(20, 7, 8), (28, 8, 9), (33, 9, 10), (45, 10, 11), (47, 11, 12), (56,
12, 13), (61, 13, 14), (70, 14, 15), (89, 15, 16), (91, 16, 17), (95,
17, 18), (98, 18, 19)]

Although my demonstration program is functionally correct, it is not
efficient. I need to do this with 10 million rows. The number of slices is
relatively small (10 to 10000).

Is it possible to construct my "slices" with pure numpy functions? E.g.
anything that does not involve a large number of Python bytecode
instructions, constructing Python objects, referencing/dereferencing 10
million times etc.

Thanks,

Laszlo

```
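
One possible way to build the same (value, start, stop) tuples without a Python-level loop over rows is to compare the key column against itself shifted by one row and take the positions where the value changes. The following is only a sketch, not a reply from the thread: the function name `group_slices` is hypothetical, and it assumes the rows are already sorted by the chosen column. Unlike the loop in the post, whose printed output has no entry for the final value 99, this version also emits the last group when it consists of a single row.

```python
import numpy as np

def group_slices(a, col):
    """Return (value, start, stop) for each run of equal values in a[:, col].

    Hypothetical helper; assumes the rows of `a` are already sorted by `col`.
    """
    keys = a[:, col]
    if keys.size == 0:
        return []
    # Row indices where the key differs from the previous row start a new run.
    change = np.flatnonzero(keys[1:] != keys[:-1]) + 1
    starts = np.concatenate(([0], change))
    stops = np.concatenate((change, [keys.size]))
    # Only the (small) number of groups is converted to Python objects.
    return list(zip(keys[starts].tolist(), starts.tolist(), stops.tolist()))

a = np.array([[1, 0], [4, 5], [4, 11], [11, 12], [99, 98]])
print(group_slices(a, 0))
# [(1, 0, 1), (4, 1, 3), (11, 3, 4), (99, 4, 5)]
```

The comparison and `np.flatnonzero` run over all rows inside numpy, so the Python-side work scales with the number of slices rather than the number of rows. Note that `a.sort(0)` sorts every column independently along axis 0; if whole rows should stay together, something like `a = a[a[:, col].argsort(kind="stable")]` could be applied before calling the helper.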