Monday, 15 February 2010

All combinations from data frame in R



I'm trying to find all combinations (not permutations; order doesn't matter) from a list, with various restrictions on the construction of each combination. I know combn() will do the trick for a simple list, and I've tried using sample(), but what I need is more complex.

I have a data frame with three columns: name, type, and cost. I want to find all possible combinations of names in sets of 7 (so 7 names per set), where 1 is of type 1, 3 are of type 2, and the rest are of type 3, and where the total cost is less than a set variable.

I'm at a total loss as to how to do this, and I'm not even sure R is the right language to do it in. Should I try a for loop with some nested if statements?

    > dput(head(sample))
    structure(list(name = structure(c(6L, 8L, 4L, 9L, 2L, 5L), .Label = c("amber",
    "cyndi", "e", "eric", "hannah", "jason", "jesse", "jim ", "lisa", "lucy",
    "matt", "ryan", "tat"), class = "factor"), type = c(2L, 3L, 3L, 1L, 3L, 3L),
    cost = c(6000L, 6200L, 9000L, 2000L, 8000L, 4500L)), .Names = c("name",
    "type", "cost"), row.names = c(NA, 6L), class = "data.frame")

And sessionInfo():

    > sessionInfo()
    R version 3.1.1 (2014-07-10)
    Platform: x86_64-apple-darwin10.8.0 (64-bit)

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] plyr_1.8.1 ggplot2_1.0.0 dplyr_0.2

    loaded via a namespace (and not attached):
    [1] assertthat_0.1 colorspace_1.2-4 digest_0.6.4 grid_3.1.1 gtable_0.1.2 MASS_7.3-33
    [7] munsell_0.4.2 parallel_3.1.1 proto_0.3-10 Rcpp_0.11.2 reshape2_1.4 scales_0.2.4
    [13] stringr_0.6.2 tools_3.1.1

Example data:

    name    type  cost
    jason   2     6000
    jim     3     6200
    eric    3     9000
    lisa    1     2000
    cyndi   3     8000
    hannah  3     4500
    e       2     7200
    matt    1     3200
    jesse   3     1200
    tat     3     3200
    ryan    1     5600
    amber   2     5222
    lucy    2     1000

One possible combination, if the total cost limit is set to 60k:

lisa, jason, amber, lucy, tat, jesse, hannah

That's one possible combination: lisa is type 1; jason, amber, and lucy are type 2; the remaining three are type 3; and the total cost of all 7 is below 60k. Another possible combination would be:

ryan, jason, amber, lucy, tat, jesse, hannah

Here ryan has replaced lisa as the type 1 member of the first combination. The cost is still below 60k.

I'm trying to get all possible combinations for which the conditions above are true.
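To make the conditions concrete, here is a minimal sketch that checks a single candidate set against them, using the example data from the question. The data frame name `people` and the helper `is_valid_combo()` are hypothetical names introduced for illustration only; they are not part of the original question.

```r
# Example data as posted in the question
people <- data.frame(
  name = c("jason", "jim", "eric", "lisa", "cyndi", "hannah", "e",
           "matt", "jesse", "tat", "ryan", "amber", "lucy"),
  type = c(2, 3, 3, 1, 3, 3, 2, 1, 3, 3, 1, 2, 2),
  cost = c(6000, 6200, 9000, 2000, 8000, 4500, 7200,
           3200, 1200, 3200, 5600, 5222, 1000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if `members` are distinct known names with
# exactly 1 of type 1, 3 of type 2 and 3 of type 3, and a total cost
# strictly below `max_cost`.
is_valid_combo <- function(members, data, max_cost) {
  rows <- data[match(members, data$name), ]
  counts <- table(factor(rows$type, levels = 1:3))
  !anyNA(rows$cost) &&
    !anyDuplicated(members) &&
    all(counts == c(1, 3, 3)) &&
    sum(rows$cost) < max_cost
}

first  <- c("lisa", "jason", "amber", "lucy", "tat", "jesse", "hannah")
second <- c("ryan", "jason", "amber", "lucy", "tat", "jesse", "hannah")

is_valid_combo(first, people, 60000)   # TRUE (total cost 23122)
is_valid_combo(second, people, 60000)  # TRUE (total cost 26722)
```

A check like this is only useful for testing individual sets; enumerating every valid set still requires generating the candidates per type.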

One possible solution using loops (maybe not the most efficient):

    # Example data
    name <- c('jason', 'jim', 'eric', 'lisa', 'cyndi', 'hanna', 'jon', 'matt', 'jerry', 'emily', 'mary', 'cynthia')
    type <- c(2, 1, 3, 3, 2, 3, 3, 1, 2, 2, 3, 2)
    cost <- c(9200, 8200, 9000, 8700, 9100, 8900, 9800, 7800, 9600, 9300, 8100, 7800)
    df <- data.frame(name, type, cost)

    # Split the data by type
    v1 <- subset(df, type == 1)
    v2 <- subset(df, type == 2)
    v3 <- subset(df, type == 3)

    # All combinations of the desired size within each subset
    m1 <- v1$name
    m2 <- combn(v2$name, 3)
    m3 <- combn(v3$name, 3)
    n1 <- length(m1)
    n2 <- ncol(m2)
    n3 <- ncol(m3)

    # Build all combinations across the subsets
    all.combs <- as.list(rep(NA, n1 * n2 * n3))
    idx <- 1
    for (i in 1:n1) {
      for (j in 1:n2) {
        for (k in 1:n3) {
          all.combs[[idx]] <- c(as.character(m1[i]), as.character(m2[, j]), as.character(m3[, k]))
          idx <- idx + 1
        }
      }
    }

    # Check which combinations have a total cost < 60k
    # (the accumulator is named `total` so it does not mask base::sum)
    cond <- rep(NA, length(all.combs))
    for (i in 1:length(all.combs)) {
      total <- 0
      for (j in 1:7) {
        total <- total + df$cost[df$name == all.combs[[i]][j]]
      }
      cond[i] <- total < 60000
    }
    res <- all.combs[cond]
    res
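The same idea can be sketched without explicit nested loops by combining combn() with expand.grid(). This is an alternative sketch, not the method given in the answer above. It uses the example data from the question, and the names `people`, `max_cost`, `part_cost()` and `result` are assumptions introduced for illustration. It also assumes each type has at least as many rows as are needed from it, since combn() raises an error otherwise.

```r
# Example data as posted in the question
people <- data.frame(
  name = c("jason", "jim", "eric", "lisa", "cyndi", "hannah", "e",
           "matt", "jesse", "tat", "ryan", "amber", "lucy"),
  type = c(2, 3, 3, 1, 3, 3, 2, 1, 3, 3, 1, 2, 2),
  cost = c(6000, 6200, 9000, 2000, 8000, 4500, 7200,
           3200, 1200, 3200, 5600, 5222, 1000),
  stringsAsFactors = FALSE
)
max_cost <- 60000

# Split by type
t1 <- people[people$type == 1, ]
t2 <- people[people$type == 2, ]
t3 <- people[people$type == 3, ]

# Row-index combinations within each type: choose 1, 3 and 3 rows.
# Each column of the resulting matrix is one partial combination.
i1 <- combn(seq_len(nrow(t1)), 1)
i2 <- combn(seq_len(nrow(t2)), 3)
i3 <- combn(seq_len(nrow(t3)), 3)

# Hypothetical helper: total cost of each partial combination (one per column)
part_cost <- function(tbl, idx) {
  colSums(matrix(tbl$cost[idx], nrow = nrow(idx)))
}
s1 <- part_cost(t1, i1)
s2 <- part_cost(t2, i2)
s3 <- part_cost(t3, i3)

# Every pairing of one partial combination per type
grid <- expand.grid(a = seq_len(ncol(i1)),
                    b = seq_len(ncol(i2)),
                    c = seq_len(ncol(i3)))
nrow(grid)  # 240 candidates: choose(3, 1) * choose(4, 3) * choose(6, 3)

# Keep the pairings whose summed cost is strictly below the limit
total <- s1[grid$a] + s2[grid$b] + s3[grid$c]
keep <- grid[total < max_cost, ]

# Translate the kept index triples back into name vectors of length 7
result <- lapply(seq_len(nrow(keep)), function(r) {
  c(t1$name[i1[, keep$a[r]]],
    t2$name[i2[, keep$b[r]]],
    t3$name[i3[, keep$c[r]]])
})
length(result)
```

Summing the cost once per partial combination, rather than once per full set of 7, avoids repeated name lookups. The number of candidates still grows as the product of the three binomial coefficients, so this stays practical only for modest group sizes.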
