R - List operations on variables containing a string
I have to admit I'm totally stumped on this one. Apologies for not having a clear attempt, though I can ask the question clearly:
I have a list of dataframes. Among them, there are multiple date-type variables that need to be formatted as dates (e.g., as.Date(data$var, format = "%m/%d/%y")).
The problem is that the date variable is named differently in each of the dataframes. In the illustration below, we've got "start_date" and "end_date".
Is there a way to write a function that operates on the variable names in a dataframe and, if it finds text that contains "date", applies the formatting operation?
The dataframes:
west <- data.frame(
  spend = sample(50:100, 50, replace=TRUE),
  trials = sample(100:200, 50, replace=TRUE),
  start_date = sample(c("06/07/14","06/08/14","06/09/14"), 50, replace=TRUE),
  country = sample(c("usa","canada","uk"), 50, replace=TRUE)
)
east <- data.frame(
  end_date = sample(c("06/07/14","06/08/14","06/09/14"), 50, replace=TRUE),
  spend = sample(50:100, 50, replace=TRUE),
  trials = sample(100:200, 50, replace=TRUE),
  country = sample(c("china","japan","skorea"), 50, replace=TRUE)
)
And turning them into a list (in reality, a much larger list):
combined <- c(west,east)
How do I take the logical vector from the grepl statement and tell R to operate on the variables where the logical vector is "TRUE" across the list elements?
grepl("date", names(combined))
[1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE
Try:
lst1 <- lapply(list(west, east), function(x) {
  # Logical vector marking the columns whose name contains "date"
  indx <- grepl("date", names(x))
  x[,indx] <- as.Date(x[,indx], format="%m/%d/%y")
  x
})
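One way to confirm the conversion worked is to inspect the class of each column in the result. The following is a minimal sketch, assuming small hand-written data frames with exactly one "date" column each; west_small, east_small, converted and col_classes are hypothetical names used only for illustration.

```r
# Hypothetical miniature versions of the data frames, with fixed values
west_small <- data.frame(
  spend = c(60, 75),
  start_date = c("06/07/14", "06/08/14"),
  stringsAsFactors = FALSE
)
east_small <- data.frame(
  end_date = c("06/09/14", "06/07/14"),
  spend = c(55, 90),
  stringsAsFactors = FALSE
)

converted <- lapply(list(west_small, east_small), function(x) {
  indx <- grepl("date", names(x))
  # Single-column case: x[, indx] is a plain vector here
  x[, indx] <- as.Date(x[, indx], format = "%m/%d/%y")
  x
})

# First class of every column in every data frame
col_classes <- lapply(converted, function(x) {
  sapply(x, function(col) class(col)[1])
})
print(col_classes)
```

The matching columns should report "Date", while the remaining columns keep their original classes.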
In case you need to update the individual objects, i.e. east, west, etc. (which is usually not needed, because most operations, including saving to a file with write.csv/write.table, can be done within the list using lapply):
list2env(setNames(lst1, c("west", "east")), envir=.GlobalEnv)
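The point about saving files from inside the list, rather than copying each data frame back to the global environment, can be sketched roughly as follows. This is only an illustration: the data in dfs, the file names, and the use of tempdir() as the output directory are assumptions, not part of the original answer.

```r
# Hypothetical named list of data frames standing in for lst1
dfs <- list(
  west = data.frame(spend = c(60, 75),
                    start_date = as.Date(c("2014-06-07", "2014-06-08"))),
  east = data.frame(end_date = as.Date(c("2014-06-09", "2014-06-07")),
                    spend = c(55, 90))
)

out_dir <- tempdir()  # assumed output location

# Iterate over the names so each file is named after its list element
paths <- lapply(names(dfs), function(nm) {
  path <- file.path(out_dir, paste0(nm, ".csv"))
  write.csv(dfs[[nm]], path, row.names = FALSE)
  path
})

print(basename(unlist(paths)))
```

Looping over names(dfs) instead of the list itself keeps the element name available for building the file name.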
Update: if there are multiple variables containing "date":
east <- data.frame(
  end_date = sample(c("06/07/14","06/08/14","06/09/14"), 50, replace=TRUE),
  new_date = sample(c("06/07/14","06/12/14","06/09/14"), 50, replace=TRUE),
  spend = sample(50:100, 50, replace=TRUE),
  trials = sample(100:200, 50, replace=TRUE),
  country = sample(c("china","japan","skorea"), 50, replace=TRUE)
)

lst2 <- lapply(list(west, east), function(x) {
  indx <- grepl("date", names(x))
  # drop=FALSE keeps a data frame even when only one column matches
  x[,indx] <- lapply(x[,indx,drop=FALSE], as.Date, format="%m/%d/%y")
  x
})

lapply(lst2, head, 2)
#[[1]]
#  spend trials start_date country
#1    83    188 2014-06-09     usa
#2    83    107 2014-06-08     usa
#
#[[2]]
#    end_date   new_date spend trials country
#1 2014-06-08 2014-06-12    53    144   china
#2 2014-06-08 2014-06-09   100    118   china
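The drop=FALSE argument is what lets the same function handle one or several matching columns. A minimal sketch of the difference, using a hypothetical two-row data frame df with a single "date" column:

```r
# Hypothetical data frame with one matching column
df <- data.frame(
  start_date = c("06/07/14", "06/08/14"),
  spend = c(60, 75),
  stringsAsFactors = FALSE
)
indx <- grepl("date", names(df))

# Default behaviour: a single matching column collapses to a plain vector
print(class(df[, indx]))

# With drop = FALSE the result stays a data frame (a list of columns),
# so lapply() visits each column rather than each element
print(class(df[, indx, drop = FALSE]))

df[, indx] <- lapply(df[, indx, drop = FALSE], as.Date, format = "%m/%d/%y")
str(df)
```

Without drop=FALSE, lapply would be applied to the individual strings of the lone column instead of to the column as a whole.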