Monday, 15 July 2013

sequences - Fill in Missing Weekly Data Points in R as 0 for Each ID



I have data in this shape:

    > head(posts)
          id week_number num_posts
    1 ukl1.1           1         4
    2 ukl1.1           6         9
    3 ukl1.2           1         2
    4 ukl1.3           1         8
    5 ukl1.3           2         7
    6 ukl1.3           3         3

and I want to expand it so that each id has a row for each week_number (1, 2, 3, 4, 5, 6); if a week_number isn't in the data, num_posts should be 0.

I've seen this done using the zoo package for true time-series data, but without creating a proper POSIXct or Date version of week_number and using that package, is there a way to do this directly?

Here's a way using data.table.

    library(data.table)
    setDT(posts)    # convert posts to a data.table
    # one row per week between each id's first and last observed week
    all.wks <- posts[, list(week_number = min(week_number):max(week_number)), by = id]
    setkey(posts, id, week_number)      # index on id and week number
    setkey(all.wks, id, week_number)    # index on id and week number
    result <- posts[all.wks]            # keyed data.table joins are fast
    result[is.na(num_posts), num_posts := 0]    # convert NA to 0
    result
    #         id week_number num_posts
    #  1: ukl1.1           1         4
    #  2: ukl1.1           2         0
    #  3: ukl1.1           3         0
    #  4: ukl1.1           4         0
    #  5: ukl1.1           5         0
    #  6: ukl1.1           6         9
    #  7: ukl1.2           1         2
    #  8: ukl1.3           1         8
    #  9: ukl1.3           2         7
    # 10: ukl1.3           3         3
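Note that this join fills gaps only between each id's first and last observed week, so ukl1.2 keeps a single row. If every id should get all six weeks, as the question describes, one possible sketch is to build the full grid with CJ() and join onto it. The 1:6 range and the names grid and full are assumptions for illustration, and the on= argument needs a reasonably recent data.table version.

```r
library(data.table)

# Sample data copied from the question
posts <- data.table(
  id          = c("ukl1.1", "ukl1.1", "ukl1.2", "ukl1.3", "ukl1.3", "ukl1.3"),
  week_number = c(1L, 6L, 1L, 1L, 2L, 3L),
  num_posts   = c(4, 9, 2, 8, 7, 3)
)

# Every combination of id and week 1..6 (the week range is assumed here)
grid <- CJ(id = unique(posts$id), week_number = 1:6)

# Keep every row of grid and pull in num_posts where a match exists
full <- posts[grid, on = c("id", "week_number")]

# Weeks with no match come back as NA; set them to 0
full[is.na(num_posts), num_posts := 0]
full
```

With the sample data this should give 18 rows (3 ids times 6 weeks), with the original counts in place and 0 everywhere else.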

Another way:

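    my_fun <- function(x) {
        # full run of weeks between this id's first and last observed week
        weeks = with(x, min(week_number):max(week_number))
        # num_posts for weeks that exist, NA for weeks that are missing
        posts = with(x, num_posts[match(weeks, week_number)])
        list(week_number=weeks, num_posts=posts)
    }
    setDT(posts)[, my_fun(.SD), by=id]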

.SD means Subset of Data; it contains the subset of the data corresponding to each group specified in by, with all columns except the grouping column (here, id).
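To see what .SD holds for each group, a small sketch like the following may help. The sample data is copied from the question, and the name sd_summary is only illustrative.

```r
library(data.table)

# Sample data copied from the question
posts <- data.table(
  id          = c("ukl1.1", "ukl1.1", "ukl1.2", "ukl1.3", "ukl1.3", "ukl1.3"),
  week_number = c(1L, 6L, 1L, 1L, 2L, 3L),
  num_posts   = c(4, 9, 2, 8, 7, 3)
)

# For each id, report which columns .SD holds and how many rows it has
sd_summary <- posts[, .(columns = paste(names(.SD), collapse = ", "),
                        rows    = nrow(.SD)),
                    by = id]
sd_summary
```

Each group should list week_number and num_posts but not id, since the grouping column is left out of .SD.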

Then you can replace the NAs as shown above.
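Putting the second approach together end to end might look like this. It is a minimal sketch on the question's sample data; the name filled is an assumption, and the helper is restated so the block runs on its own.

```r
library(data.table)

# Sample data copied from the question
posts <- data.table(
  id          = c("ukl1.1", "ukl1.1", "ukl1.2", "ukl1.3", "ukl1.3", "ukl1.3"),
  week_number = c(1L, 6L, 1L, 1L, 2L, 3L),
  num_posts   = c(4, 9, 2, 8, 7, 3)
)

# Expand one id's rows to its full week range, leaving NA for missing weeks
my_fun <- function(x) {
  weeks <- with(x, min(week_number):max(week_number))
  n     <- with(x, num_posts[match(weeks, week_number)])
  list(week_number = weeks, num_posts = n)
}

# Apply the helper per id, then turn the NA gaps into 0
filled <- posts[, my_fun(.SD), by = id]
filled[is.na(num_posts), num_posts := 0]
filled
```

This should reproduce the same ten rows as the keyed join in the first answer.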

r sequences
