Sunday, 15 May 2011

r - List operations on variables containing string -




I have to admit I'm totally stumped on this one. Apologies for not having a clear attempt to show, though I can at least state the question clearly:

I have a list of dataframes. In each of them, there are multiple date-type variables that need to be formatted as dates (e.g., as.Date(data$var, format = "%m/%d/%y")).
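For reference, a minimal sketch of that conversion on a single character vector. The values are made up for illustration and are not taken from the real data:

```r
# Made-up sample values in month/day/two-digit-year form
x <- c("06/07/14", "06/08/14")

# %m = month, %d = day, %y = two-digit year
d <- as.Date(x, format = "%m/%d/%y")

class(d)   # "Date"
format(d)  # "2014-06-07" "2014-06-08"
```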

The problem is that the date variable is named differently in each of the dataframes. In the illustration below, we've got "start_date" and "end_date".

Is there a way to write a function that operates on the variable names in a dataframe and, if it finds a name that contains the text "date", applies the formatting operation?

The dataframes:

west <- data.frame(
  spend = sample(50:100, 50, replace = TRUE),
  trials = sample(100:200, 50, replace = TRUE),
  start_date = sample(c("06/07/14", "06/08/14", "06/09/14"), 50, replace = TRUE),
  country = sample(c("usa", "canada", "uk"), 50, replace = TRUE)
)

east <- data.frame(
  end_date = sample(c("06/07/14", "06/08/14", "06/09/14"), 50, replace = TRUE),
  spend = sample(50:100, 50, replace = TRUE),
  trials = sample(100:200, 50, replace = TRUE),
  country = sample(c("china", "japan", "skorea"), 50, replace = TRUE)
)

And turning them into a list (in reality, it is a much larger list):

combined <- c(west,east)

How do I take the logical vector from a grepl statement and tell R to operate on the variables where that logical vector is TRUE, across the list elements?

grepl("date", names(combined))
[1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE

Try

lst1 <- lapply(list(west, east), function(x) {
  # flag the columns whose name contains "date"
  indx <- grepl("date", names(x))
  x[, indx] <- as.Date(x[, indx], format = "%m/%d/%y")
  x
})
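Note that the code above builds the list with list(west, east) rather than c(west, east). A small sketch of the difference, using hypothetical stand-in dataframes (west_small and east_small are made up here to keep the output short):

```r
# Hypothetical, smaller stand-ins for the dataframes in the question
west_small <- data.frame(spend = 1:3,
                         start_date = c("06/07/14", "06/08/14", "06/09/14"))
east_small <- data.frame(end_date = c("06/07/14", "06/08/14", "06/09/14"),
                         spend = 4:6)

# c() concatenates the columns into one flat list: the dataframe structure is lost
flat <- c(west_small, east_small)
length(flat)

# list() keeps each dataframe as one element
nested <- list(west = west_small, east = east_small)
length(nested)

# One logical vector per dataframe, matched against that dataframe's own names
date_cols <- lapply(nested, function(x) grepl("date", names(x)))
date_cols
```

With the nested form, lapply visits one dataframe at a time, so the logical index always lines up with the columns of the dataframe being modified.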

In case you need to update the individual objects, i.e. east, west, etc. (which is usually not needed, because most operations, including saving to a file with write.csv/write.table, can be done within the list using lapply):

list2env(setNames(lst1, c("west", "east")), envir = .GlobalEnv)

Update

If there are multiple date variables:

east <- data.frame(
  end_date = sample(c("06/07/14", "06/08/14", "06/09/14"), 50, replace = TRUE),
  new_date = sample(c("06/07/14", "06/12/14", "06/09/14"), 50, replace = TRUE),
  spend = sample(50:100, 50, replace = TRUE),
  trials = sample(100:200, 50, replace = TRUE),
  country = sample(c("china", "japan", "skorea"), 50, replace = TRUE)
)

lst2 <- lapply(list(west, east), function(x) {
  indx <- grepl("date", names(x))
  # drop = FALSE keeps a dataframe even when only one column matches
  x[, indx] <- lapply(x[, indx, drop = FALSE], as.Date, format = "%m/%d/%y")
  x
})

lapply(lst2, head, 2)
#[[1]]
#  spend trials start_date country
#1    83    188 2014-06-09     usa
#2    83    107 2014-06-08     usa

#[[2]]
#    end_date   new_date spend trials country
#1 2014-06-08 2014-06-12    53    144   china
#2 2014-06-08 2014-06-09   100    118   china
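The same idea can be wrapped in a small helper so the pattern and the date format are not hard-coded. This is only a sketch: convert_date_cols, west_demo, east_demo and lst3 are hypothetical names introduced here, not part of the original answer.

```r
# Hypothetical helper: convert every column whose name matches `pattern`
convert_date_cols <- function(df, pattern = "date", format = "%m/%d/%y") {
  indx <- grepl(pattern, names(df))
  # as.character() guards against the columns having been read in as factors
  df[indx] <- lapply(df[indx], function(col) {
    as.Date(as.character(col), format = format)
  })
  df
}

# Made-up demo data with one and two date columns respectively
west_demo <- data.frame(spend = c(60, 75),
                        start_date = c("06/07/14", "06/09/14"))
east_demo <- data.frame(end_date = c("06/08/14", "06/09/14"),
                        new_date = c("06/12/14", "06/07/14"),
                        spend = c(55, 90))

lst3 <- lapply(list(west = west_demo, east = east_demo), convert_date_cols)

# Check which columns of the second dataframe are now of class Date
sapply(lst3$east, inherits, what = "Date")
```

Checking the classes afterwards is a cheap way to confirm that only the intended columns were converted.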
