Saturday 15 May 2010

R - Farthest element within limited distance for a sorted vector




We have a sorted vector foo, and for each element i we want to find the largest j such that foo[j] - foo[i] < 10. For instance, when

foo <- c(1,2,5,7,13,17,25,33,85)

the answer is:

bar <- c(4,4,5,5,6,7,8,8,9)

(For i = 1, the largest j is 4, since foo[4] - foo[1] = 7 - 1 = 6 < 10. Hence the first item of bar is 4.)

We can compute bar using a for or while loop, but we are looking for more efficient code in R. Any ideas?
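For reference, the loop-based baseline mentioned above could look something like the sketch below. The function name farthest_within and the argument d are made up for illustration and are not from the original question. It assumes foo is sorted in increasing order, so the index j never has to move backwards as i advances.

```r
# Hypothetical baseline: two-pointer loop over a sorted vector.
# For each i, returns the largest j with foo[j] - foo[i] < d.
farthest_within <- function(foo, d = 10) {
  n <- length(foo)
  bar <- integer(n)
  j <- 1L
  for (i in seq_len(n)) {
    if (j < i) j <- i                      # j can never be smaller than i
    # advance j while the next element is still within distance d
    while (j < n && foo[j + 1L] - foo[i] < d) j <- j + 1L
    bar[i] <- j
  }
  bar
}

foo <- c(1, 2, 5, 7, 13, 17, 25, 33, 85)
farthest_within(foo)
# [1] 4 4 5 5 6 7 8 8 9
```

Because j only moves forward, the number of steps is linear in length(foo), but an explicit loop in R is still usually slower than a vectorised call.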

Here's a method that should scale better, using the overlapping range joins function foverlaps() from data.table version 1.9.4:

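require(data.table) ## 1.9.4+
## each query interval is [foo, foo+9]; for integer values this matches foo[j]-foo[i] < 10
x = data.table(start=foo, end=foo+9L)
lookup = data.table(start=foo, end=foo)
setkey(lookup) ## order doesn't change, as 'foo' is sorted
## mult="last" keeps the last (largest) matching row; which=TRUE returns row indices
foverlaps(x, lookup, mult="last", which=TRUE)
# [1] 4 4 5 5 6 7 8 8 9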

Timing on 100,000 numbers:

set.seed(45L)
foo <- sort(sample(1e6, 1e5, FALSE))
arun <- function(foo) {
    x = data.table(start=foo, end=foo+9L)
    lookup = data.table(start=foo, end=foo)
    setkey(lookup)
    foverlaps(x, lookup, mult="last", which=TRUE)
}
system.time(arun(foo))
#    user  system elapsed
#   0.142   0.009   0.153
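As a cross-check that needs no extra package, base R's findInterval() should give the same indices. This is a different technique from the foverlaps() join above, not part of that answer, and only a minimal sketch. It assumes a version of R where findInterval() has the left.open argument (3.3.0 onward, as far as I know). With left.open = TRUE, the result for each query value is the number of elements of foo strictly below it, which for a sorted foo is the largest j with foo[j] < foo[i] + 10.

```r
foo <- c(1, 2, 5, 7, 13, 17, 25, 33, 85)

# Count, for each i, the elements of foo strictly less than foo[i] + 10.
# Since foo is sorted, that count is the index of the farthest such element.
bar_fi <- findInterval(foo + 10, foo, left.open = TRUE)
bar_fi
# [1] 4 4 5 5 6 7 8 8 9
```

Unlike the end = foo+9L trick, this form uses the strict inequality directly, so it does not rely on foo holding integer values.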

