Friday 15 March 2013

R - Converting data to a 2-column format?




If I have a dataset like the following:

    la ny ma
    1  2  3
    4  5  6
    3  5
       4

(In other words, each row has a different structure: la has 3 values, ny has 4 values, etc.) I am trying to use lm to perform an ANOVA test (to decide whether the mean number is the same in each state), but it keeps showing "an error occurred" because the rows do not match. One thought I had was to convert the data to a 2-column format. What command/package should I use to perform this task?

Edit: the data is in a txt file.

Another option, after you read the file, is to convert it to the 2-column format with stack:

    df <- read.table("betty.txt", header=TRUE, fill=TRUE, sep="\t")
    ## (as @Richard Scriven mentioned in the comment)
    na.omit(stack(df))
    #   values ind
    #1       1  la
    #2       4  la
    #3       3  la
    #5       2  ny
    #6       5  ny
    #7       5  ny
    #8       4  ny
    #9       3  ma
    #10      6  ma

Update

The above works after transforming the data to have a \t delimiter. But if the file is copy/pasted straight from the OP's post without alteration (making sure there are spaces in the 3rd and 4th rows after the 2nd column):

    lines <- readLines('betty1.txt')
    # Replace runs of spaces that separate (or stand in for) fields with commas
    lines2 <- gsub("(?<=[^ ]) +|^[ ]+(?<=[ ])(?=[^ ])", ",", lines, perl=TRUE)
    lines2
    #[1] "la,ny,ma" "1,2,3"    "4,5,6"    "3,5,"     ",4,"
    df1 <- read.table(text=lines2, sep=',', header=TRUE)
    df1
    #  la ny ma
    #1  1  2  3
    #2  4  5  6
    #3  3  5 NA
    #4 NA  4 NA

and

    na.omit(stack(df1))

Update2

Another option, if you have fixed-width columns, is to use read.fwf:

    # Read the body as three 3-character columns, skipping the header line
    df <- read.fwf('betty1.txt', widths=c(3,3,3), skip=1)
    # Take the column names from the first line of the file
    colnames(df) <- scan('betty1.txt', nlines=1, what="", quiet=TRUE)
    df
    #  la ny ma
    #1  1  2  3
    #2  4  5  6
    #3  3  5 NA
    #4 NA  4 NA
    library(tidyr)
    gather(df, var, val, la:ma, na.rm=TRUE)
    #  var val
    #1  la   1
    #2  la   4
    #3  la   3
    #4  ny   2
    #5  ny   5
    #6  ny   5
    #7  ny   4
    #8  ma   3
    #9  ma   6
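Once the data is in the 2-column format, the ANOVA the question asks about should be straightforward. The following is only a minimal sketch, not part of the original answer: it types the values in by hand instead of reading betty.txt, and the object names wide, long and fit are made up for illustration.

```r
# Wide data as in the question; NA pads the shorter columns.
# (Typed in by hand here as an assumption, instead of reading betty.txt.)
wide <- data.frame(
  la = c(1, 4, 3, NA),
  ny = c(2, 5, 5, 4),
  ma = c(3, 6, NA, NA)
)

# 2-column format: `values` holds the numbers, `ind` is a factor of state names
long <- na.omit(stack(wide))

# One-way ANOVA through lm: is the mean of `values` the same in each state?
fit <- lm(values ~ ind, data = long)
anova(fit)
```

Because every row of long is one observation with its state label, lm no longer has to deal with columns of unequal length.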
