Fill in Missing Weekly Data Points in R as 0 for Each ID
I have data in this shape:

    > head(posts)
          id week_number num_posts
    1 ukl1.1           1         4
    2 ukl1.1           6         9
    3 ukl1.2           1         2
    4 ukl1.3           1         8
    5 ukl1.3           2         7
    6 ukl1.3           3         3

and I want to make it such that each id has a row for each week_number (1,2,3,4,5,6), and if a week_number isn't in the data then num_posts should = 0.

I've seen this done using the zoo package with true time-series data. Without creating a proper POSIXct or Date version of week_number and using that package, is there a way to do this directly?
Here's a way using data.table.

    library(data.table)
    setDT(posts)                               # convert posts to a data.table
    all.wks <- posts[, list(week_number = min(week_number):max(week_number)), by = id]
    setkey(posts, id, week_number)             # index on id and week number
    setkey(all.wks, id, week_number)           # index on id and week number
    result <- posts[all.wks]                   # data.table joins are fast
    result[is.na(num_posts), num_posts := 0]   # convert NA to 0
    result
    #         id week_number num_posts
    #  1: ukl1.1           1         4
    #  2: ukl1.1           2         0
    #  3: ukl1.1           3         0
    #  4: ukl1.1           4         0
    #  5: ukl1.1           5         0
    #  6: ukl1.1           6         9
    #  7: ukl1.2           1         2
    #  8: ukl1.3           1         8
    #  9: ukl1.3           2         7
    # 10: ukl1.3           3         3
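Note that the join above fills each id only between its own first and last week, so ukl1.2 keeps a single row. If every id should cover weeks 1 through 6, as the question asks, one possible variation is to build the full id/week grid with `CJ()` and join onto it. This is only a sketch: the names `grid` and `filled` are made up for illustration, and the hard-coded range `1:6` is an assumption taken from the question.

```r
library(data.table)

# Sample data copied from the question
posts <- data.table(
  id          = c("ukl1.1", "ukl1.1", "ukl1.2", "ukl1.3", "ukl1.3", "ukl1.3"),
  week_number = c(1L, 6L, 1L, 1L, 2L, 3L),
  num_posts   = c(4, 9, 2, 8, 7, 3)
)

# Every combination of id and week 1..6 (the fixed range is an assumption)
grid <- CJ(id = unique(posts$id), week_number = 1:6)

# Join posts onto the grid; weeks missing from posts come back as NA
filled <- posts[grid, on = c("id", "week_number")]

# Replace the NAs with 0
filled[is.na(num_posts), num_posts := 0]
filled
```

With three ids and six weeks this should give 18 rows, one per id and week.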
Another way:

    my_fun <- function(x) {
        weeks = with(x, min(week_number):max(week_number))
        posts = with(x, num_posts[match(weeks, week_number)])   # NA where a week is missing
        list(week_number=weeks, num_posts=posts)
    }
    setDT(posts)[, my_fun(.SD), by=id]
.SD means Subset of Data; it contains the data subset corresponding to each group specified in by, with all columns excluding the grouping column (here, id).
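To see what .SD holds for each group, a quick check along these lines may help. It is a sketch on the question's sample data; the result name `info` and the summary column names `columns` and `rows` are made up for illustration.

```r
library(data.table)

# Sample data copied from the question
posts <- data.table(
  id          = c("ukl1.1", "ukl1.1", "ukl1.2", "ukl1.3", "ukl1.3", "ukl1.3"),
  week_number = c(1L, 6L, 1L, 1L, 2L, 3L),
  num_posts   = c(4, 9, 2, 8, 7, 3)
)

# For each id, list the columns visible in .SD and count the rows in the group
info <- posts[, .(columns = paste(names(.SD), collapse = ", "), rows = .N), by = id]
info
```

The grouping column id should not appear among the .SD column names, only week_number and num_posts.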
Then you can replace the NAs as shown above.
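Alternatively, the replacement could be folded into the helper itself. The following is a minimal sketch of that idea, not the answer's original code; `fill_weeks` and `counts` are hypothetical names.

```r
library(data.table)

# Sample data copied from the question
posts <- data.table(
  id          = c("ukl1.1", "ukl1.1", "ukl1.2", "ukl1.3", "ukl1.3", "ukl1.3"),
  week_number = c(1L, 6L, 1L, 1L, 2L, 3L),
  num_posts   = c(4, 9, 2, 8, 7, 3)
)

# Hypothetical helper: expand one id's weeks and zero-fill the gaps
fill_weeks <- function(x) {
  weeks  <- with(x, min(week_number):max(week_number))
  counts <- with(x, num_posts[match(weeks, week_number)])  # NA for missing weeks
  counts[is.na(counts)] <- 0                               # replace NA with 0
  list(week_number = weeks, num_posts = counts)
}

result <- posts[, fill_weeks(.SD), by = id]
result
```

As in the first answer, each id is filled only between its own minimum and maximum week.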