Saturday, 15 June 2013

data.frame - reshape data frame in R -




I have a data frame that I need to reshape, transforming repeated values in a single column into a single row with several data columns. I know this should be simple, but I can't figure out how to do it, or which of the many reshape/cast functions available I need to use.

Part of the data looks like this:

      source       id info
    1     in   842701    1
    2    out   842701    1
    3     in 21846591    2
    4    out 21846591    2
    5     in 22181760    3
    6     in 39338740    4
    7    out     9428    5

I want to create this:

            id in out info
    1   842701  1   1    1
    2 21846591  1   1    2
    3 22181760  1   0    3
    4 39338740  1   0    4
    5     9428  0   1    5

And so on, while preserving the remaining columns (which are identical for a given entry).

I'd appreciate any help. TIA.

Here is a way using reshape2:

    library(reshape2)

    res <- dcast(transform(df, indx=1, id=factor(id, levels=unique(id))),
                 id~source, value.var="indx", fill=0)
    res
    #        id in out
    #1   842701  1   1
    #2 21846591  1   1
    #3 22181760  1   0
    #4 39338740  1   0
    #5     9428  0   1
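The id=factor(id, levels=unique(id)) step in the call above appears to be there to keep the ids in order of first appearance; a plain factor would sort them, which would move 9428 to the top. A minimal base R sketch of that difference (the names ids, f_sorted and f_first are illustrative and not part of the original answer):

```r
# The ids from the question, in their original row order
ids <- c(842701L, 842701L, 21846591L, 21846591L,
         22181760L, 39338740L, 9428L)

# Default factor(): levels are the sorted unique values
f_sorted <- factor(ids)
levels(f_sorted)

# Explicit levels: order of first appearance is kept
f_first <- factor(ids, levels = unique(ids))
levels(f_first)
```

Since dcast and table lay out rows by factor level, the second form is what keeps the output rows in the same order as the input.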

Or:

    res1 <- as.data.frame.matrix(table(transform(df, id=factor(id, levels=unique(id)))[,2:1]))

Update

To preserve the remaining columns, use ... on the left-hand side of the formula:

    dcast(transform(df1, indx=1, id=factor(id, levels=unique(id))),
          ...~source, value.var="indx", fill=0)
    #        id info in out
    #1   842701    1  1   1
    #2 21846591    2  1   1
    #3 22181760    3  1   0
    #4 39338740    4  1   0
    #5     9428    5  0   1
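With the table() route, the extra columns are not carried along, so they have to be attached separately. A minimal base R sketch, assuming (as the question states) that info is identical for every row of a given id; the names id_f, counts and wide are illustrative only:

```r
df1 <- data.frame(
  source = c("in", "out", "in", "out", "in", "in", "out"),
  id     = c(842701L, 842701L, 21846591L, 21846591L,
             22181760L, 39338740L, 9428L),
  info   = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  stringsAsFactors = FALSE
)

# Count occurrences of each source per id, keeping ids in first-seen order
id_f   <- factor(df1$id, levels = unique(df1$id))
counts <- as.data.frame.matrix(table(id_f, df1$source))

# Attach id and the first info value seen for each id
wide <- data.frame(
  id   = unique(df1$id),
  counts,
  info = df1$info[!duplicated(df1$id)],
  check.names = FALSE   # keep "in" as a column name
)
rownames(wide) <- NULL
wide
```

This gives the column order asked for in the question (id, in, out, info). Because in is a reserved word in R, the column is easiest to access as wide[["in"]].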

You could also use reshape from base R:

    res2 <- reshape(transform(df1, indx=1), idvar=c("id", "info"),
                    timevar="source", direction="wide")
    res2[,3:4][is.na(res2)[,3:4]] <- 0
    res2
    #        id info indx.in indx.out
    #1   842701    1       1        1
    #3 21846591    2       1        1
    #5 22181760    3       1        0
    #6 39338740    4       1        0
    #7     9428    5       0        1

Data

    df <- structure(list(
      source = c("in", "out", "in", "out", "in", "in", "out"),
      id = c(842701L, 842701L, 21846591L, 21846591L, 22181760L, 39338740L, 9428L)),
      .names = c("source", "id"), class = "data.frame",
      row.names = c("1", "2", "3", "4", "5", "6", "7"))

    df1 <- structure(list(
      source = c("in", "out", "in", "out", "in", "in", "out"),
      id = c(842701L, 842701L, 21846591L, 21846591L, 22181760L, 39338740L, 9428L),
      info = c(1L, 1L, 2L, 2L, 3L, 4L, 5L)),
      .names = c("source", "id", "info"), class = "data.frame",
      row.names = c("1", "2", "3", "4", "5", "6", "7"))
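The base R reshape result keeps the indx. prefix on the new columns and the original row names. If plain in and out columns are preferred, a possible clean-up step looks like the sketch below; the prefix stripping with sub() is an addition here, not part of the original answer:

```r
df1 <- data.frame(
  source = c("in", "out", "in", "out", "in", "in", "out"),
  id     = c(842701L, 842701L, 21846591L, 21846591L,
             22181760L, 39338740L, 9428L),
  info   = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  stringsAsFactors = FALSE
)

# Same reshape call as in the answer: one indicator column per source
res2 <- reshape(transform(df1, indx = 1), idvar = c("id", "info"),
                timevar = "source", direction = "wide")

# Missing combinations become 0 instead of NA
res2[is.na(res2)] <- 0

# Drop the "indx." prefix and reset the row names
names(res2) <- sub("^indx\\.", "", names(res2))
rownames(res2) <- NULL
res2
```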

r data.frame
