Friday 15 April 2011

R function for counting how often a value falls below a particular value


This is proving to be a monster for me, with zero experience in R scripting. I have a data frame with 57 columns and 30 rows of data.

Here is what I am trying to do:

1) Go to each column.

2) Count the number of times 2/3/4/5/6/7/8/9 consecutive values are less than -1.

3) Print the result to a text file.

4) Repeat steps 2 and 3 for the second column, and so on.

I looked around, including the R questions on Stack Overflow, and found this one:

"Check number of times consecutive value appear based on criteria"

This is one column of my data:

data<-c(-0.996,-1.111,-0.638,0.047,0.694,1.901,2.863,2.611,2.56,2.016,0.929,-0.153,-0.617,-0.143 0.199,0.556,0.353,-0.638,0.347,0.045,-0.829,-0.882,-1.143,-0.869,0.619,0.923,-0.474,0.227 0.394,0.789,1.962,1.132,0.1,-0.278,-0.303,-0.606,-0.705,-0.858,-0.723,-0.081,1.206,2.329 1.863,2.1,1.547,2.026,0.015,-0.441,-0.371,-0.304,-0.668,-0.953,-1.256,-1.185,-0.891,-0.569 0.485,0.421,-0.004,0.024,-0.39,-0.58,-1.178,-1.101,-0.882,0.01,0.052,-0.166,-1.703,-1.048 -0.718,-0.036,-0.561,-0.08,0.272,-0.041,-0.811,-0.929,-0.853,-1.047,0.431,0.576,0.642,1.62 2.324,1.251,1.384,0.195,-0.081,-0.335,-0.176,1.089,-0.602,-1.134,-1.356,-1.203,-0.795,-0.752 -0.692,-0.813,-1.172,-0.387,-0.079,-0.374,-0.157,0.263,0.313,0.975,2.298,1.71,0.229,-0.313 -0.779,-1.12,-1.102,-1.01,-0.86,-1.118,-1.211,-1.081,-1.156,-0.972)

When I run the following code:

for (col in 1:ncol(data)) {
  runs <- rle(data[,col])
  print(runs$lengths[which(runs$values < -1)])
}

It gives me this:

[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

It has counted the number of values < -1, not the runs. What am I doing wrong here?

(massive edit)

Fixed data vector (it was missing commas):

data <- c(-0.996,-1.111,-0.638,0.047,0.694,1.901,2.863,2.611,2.56,2.016,0.929,-0.153,-0.617,-0.143, 0.199,0.556,0.353,-0.638,0.347,0.045,-0.829,-0.882,-1.143,-0.869,0.619,0.923,-0.474,0.227, 0.394,0.789,1.962,1.132,0.1,-0.278,-0.303,-0.606,-0.705,-0.858,-0.723,-0.081,1.206,2.329, 1.863,2.1,1.547,2.026,0.015,-0.441,-0.371,-0.304,-0.668,-0.953,-1.256,-1.185,-0.891,-0.569, 0.485,0.421,-0.004,0.024,-0.39,-0.58,-1.178,-1.101,-0.882,0.01,0.052,-0.166,-1.703,-1.048, -0.718,-0.036,-0.561,-0.08,0.272,-0.041,-0.811,-0.929,-0.853,-1.047,0.431,0.576,0.642,1.62, 2.324,1.251,1.384,0.195,-0.081,-0.335,-0.176,1.089,-0.602,-1.134,-1.356,-1.203,-0.795,-0.752, -0.692,-0.813,-1.172,-0.387,-0.079,-0.374,-0.157,0.263,0.313,0.975,2.298,1.71,0.229,-0.313, -0.779,-1.12,-1.102,-1.01,-0.86,-1.118,-1.211,-1.081,-1.156,-0.972)

Doing data < -1 gives you a logical vector, and you can then count the runs of TRUE and FALSE:

runs <- rle(data < -1)
print(runs)
## Run Length Encoding
##   lengths: int [1:21] 1 1 20 1 29 2 8 2 4 2 ...
##   values : logi [1:21] FALSE TRUE FALSE TRUE FALSE TRUE ...
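As a side note on why the loop in the question printed only 1s: rle() groups consecutive identical values, and raw measurements almost never repeat exactly, so every value forms its own run of length 1. The small vector below is made up purely for illustration and is not part of the original data.

```r
# Hypothetical toy vector, chosen only to illustrate the difference.
x <- c(-1.2, -1.5, 0.3, -1.1)

# rle() on the raw numbers: no two neighbours are equal,
# so each value is a run of length 1.
raw_runs <- rle(x)
print(raw_runs$lengths)

# rle() on the logical test: neighbours that are both below -1
# collapse into a single TRUE run.
flag_runs <- rle(x < -1)
print(flag_runs$lengths)
print(flag_runs$values)
```

Here the first call reports four runs of length 1, while the second reports a TRUE run of length 2 followed by single FALSE and TRUE runs.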

Then extract the lengths of the TRUE runs:

print(runs$lengths[which(runs$values)])
## [1] 1 1 2 2 2 1 3 1 3 4
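The question asked how often runs of 2 to 9 consecutive values fall below -1. One possible way to tabulate that from the run lengths is sketched here. The helper name count_runs_below and its arguments are hypothetical, and the sketch assumes "a run of k" means a run of exactly k values rather than at least k. The toy vector is made up for illustration.

```r
# Hypothetical helper: count how many below-threshold runs have each
# of the requested lengths (exact length, not "at least").
count_runs_below <- function(x, threshold = -1, run_lengths = 2:9) {
  runs <- rle(x < threshold)
  # Lengths of the TRUE runs only; which() also drops any NA flags.
  below <- runs$lengths[which(runs$values)]
  # factor() with fixed levels keeps a zero count for absent lengths.
  counts <- table(factor(below, levels = run_lengths))
  setNames(as.integer(counts), run_lengths)
}

# Toy input: one run of length 2, one of length 3, one of length 1.
x <- c(-2, -2, 0, -3, -3, -3, 0, -4)
result <- count_runs_below(x)
print(result)
```

The single value at the end of the toy vector is a run of length 1, so it is not counted under any of the lengths 2 to 9.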

And you can iterate over the columns of a data frame as shown:

# create a data frame with sampled versions of the data
set.seed(1492) # repeatable
df <- data.frame(v1=data,
                 v2=sample(data, length(data), replace=TRUE),
                 v3=sample(data, length(data), replace=TRUE),
                 v4=sample(data, length(data), replace=TRUE))
# extraction
for (col in 1:ncol(df)) {
  runs <- rle(df[, col] < -1)
  print(runs$lengths[which(runs$values)])
}
## [1] 1 1 2 2 2 1 3 1 3 4
## [1] 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1
## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
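The question also asked for the result to be written to a text file, one column after another. A minimal sketch of that step follows, under some assumptions: the helper count_runs_below, the toy data frame, and the file name run_counts.txt are all hypothetical, and runs are counted by exact length. The file goes to a temporary directory here; any path could be substituted.

```r
# Hypothetical helper: count below-threshold runs of each exact length.
count_runs_below <- function(x, threshold = -1, run_lengths = 2:9) {
  runs <- rle(x < threshold)
  below <- runs$lengths[which(runs$values)]
  setNames(as.integer(table(factor(below, levels = run_lengths))),
           run_lengths)
}

# Toy data frame standing in for the real 57-column one.
toy <- data.frame(v1 = c(-2, -2, 0, -3, -3, -3),
                  v2 = c(0, -5, -5, -5, -5, 1))

# sapply() applies the helper to every column and returns a matrix:
# one row per run length, one column per data frame column.
counts <- sapply(toy, count_runs_below)

# Write the matrix as a tab-separated text file.
out_file <- file.path(tempdir(), "run_counts.txt")
write.table(counts, file = out_file, sep = "\t",
            quote = FALSE, col.names = NA)
print(counts)
```

Using sapply() instead of an explicit for loop keeps all columns in one table, which is easier to write out in a single call than appending to the file once per column.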

