[SciPy-user] Equivalent to 'match' function in R?
Wes McKinney
wesmckinn@gmail....
Thu Jul 24 10:12:12 CDT 2008
I did a 'super naive' version of this:
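from numpy import empty, nan

def match(a, b):
    # map each value in b to its index (for duplicates, the last index wins)
    bmap = dict([(v, i) for i, v in enumerate(b)])
    res = empty(len(a))
    for i, val in enumerate(a):
        # values of a that are absent from b come back as NaN
        res[i] = bmap.get(val, nan)
    return res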
It runs pretty slowly on a test case that matches arange(20000) against a
shuffled copy of itself:
In [28]: timeit match(a, b)
10 loops, best of 3: 49.9 ms per loop
The same, slightly less naive, implementation written entirely in Cython and
working only with ndarrays:
In [30]: timeit cmatch(a, b)
100 loops, best of 3: 10.3 ms per loop
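
For comparison, a loop-free version can be sketched in plain NumPy by sorting b
once and looking up a with searchsorted. This is only a sketch, not the Cython
cmatch timed above: the name match_sorted and the -1 "missing" sentinel are
assumptions made for illustration. Unlike the dict version, it returns the
first matching index when b contains duplicates, which is what R's match does.

```python
import numpy as np

def match_sorted(a, b, missing=-1):
    """For each element of a, return the index of its first occurrence in b.

    Hypothetical helper: elements of a that are absent from b get `missing`.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if b.size == 0:
        return np.full(a.shape, missing, dtype=np.intp)
    # A stable sort keeps equal values in their original order, so the
    # leftmost hit in the sorted array is the first occurrence in b.
    order = np.argsort(b, kind="stable")
    b_sorted = b[order]
    pos = np.searchsorted(b_sorted, a)       # leftmost insertion points
    pos = np.minimum(pos, b.size - 1)        # keep indices in bounds
    found = b_sorted[pos] == a               # exact matches only
    return np.where(found, order[pos], missing)

# 7 is not in the target, so it maps to the sentinel
print(match_sorted([3, 5, 1, 7], [1, 2, 3, 4, 5]).tolist())  # [2, 4, 0, -1]
```

I have not timed this against the two versions above, so treat any speed
expectation as a guess.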
I don't know how to compare the performance of this with R, but I assume it's
pretty comparable. The only thing that is kind of bust is the handling of
values not found in the target array: R translates them to NA, but NaN's get
translated to 0 when converted to numpy ints, and you can't index an array
with an array containing NaN's anyhow.
Hmm.
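
One possible way around the NaN problem, sketched under assumptions: return an
integer index array with a sentinel (here -1) for values that are not found,
and use a boolean mask before indexing. The helper name match_int and the extra
key 9 (added so that one lookup fails) are made up for illustration; the rest
of the data is the allData/subData example quoted below.

```python
import numpy as np

def match_int(a, b, missing=-1):
    """Hypothetical integer-valued match: index into b, or `missing`."""
    bmap = {}
    for i, v in enumerate(b):
        bmap.setdefault(v, i)  # keep the first index, as R's match does
    return np.array([bmap.get(v, missing) for v in a], dtype=np.intp)

all_data = np.array([[1, 2, 3, 4, 5], [12, 19, 27, 38, 51]], dtype=float).T
# keys 3, 5, 1 come from the example; 9 is an added key with no match
sub_data = np.array([[3, 5, 1, 9], [np.nan] * 4]).T

idx = match_int(sub_data[:, 0], all_data[:, 0])
found = idx >= 0  # True where a match exists

# fill only the matched rows; unmatched rows keep their NaN
sub_data[found, 1] = all_data[idx[found], 1]
print(idx.tolist())  # [2, 4, 0, -1]
print(sub_data)
```

The index array stays an integer array, so it can be used for indexing
directly once the mask has removed the sentinel entries.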
On Thu, Jul 24, 2008 at 9:49 AM, Arnar Flatberg <arnar.flatberg@gmail.com>
wrote:
>
>
> On Thu, Jul 24, 2008 at 3:00 PM, Wes McKinney <wesmckinn@gmail.com> wrote:
>
>> Hi all,
>>
>> I've been working with users lately who are transitioning from using R to
>> NumPy/Scipy. Some are accustomed to using the 'match' function, for
>> example:
>>
>> > allData <- cbind(c(1,2,3,4,5), c(12, 19, 27, 38, 51))
>> > allData
>> [,1] [,2]
>> [1,] 1 12
>> [2,] 2 19
>> [3,] 3 27
>> [4,] 4 38
>> [5,] 5 51
>> > subData <- cbind(c(3,5,1), c(NA, NA, NA))
>> > subData
>> [,1] [,2]
>> [1,] 3 NA
>> [2,] 5 NA
>> [3,] 1 NA
>>
>
> What about using `intersect` combined with `where` ?
>
> all_data = np.array([[1,2,3,4,5], [12,19,27,38,51]]).T
> sub_data = np.array([[3,5,1], [np.nan,np.nan,np.nan]]).T
> match_ind = np.where(np.intersect1d(sub_data[:,0], all_data[:,0]))
> sub_data[:,1] = all_data[match_ind,1]
>
> It may not be pretty or the best approach for solving the above examples
> but it behaves like R's match somewhat.
>
> Arnar
>
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user@scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
>