Wednesday 15 August 2012

stockquotes - Data Manipulation in R, Stock Data Transformation -




I have the following data.frame in the classic "time * OHLC" format of exchange data.

The starting point is this data frame:

date        time  open    high    low     close
01/28/2002  0833  543.81  543.82  543.84  543.85
01/28/2002  0850  542.95  542.95  542.95  542.95
01/28/2002  0901  542.45  542.45  542.45  542.45
01/28/2002  0911  542.45  542.45  542.45  542.45

There are 1534129 rows in the table. I am a little bit desperate about moving the data into the following structure:

date        time  price
01/28/2002  0833  543.81
01/28/2002  0833  543.82
01/28/2002  0833  543.84
01/28/2002  0833  543.85
01/28/2002  0850  542.95

That is how the first line should be rewritten, and the same expansion should be repeated for every line of the original file. The second part of the task is to set a parameter (a distribution) that will decide whether the high or the low comes first during the bar creation phase. This of course has further implications for data manipulation later, but I can't even get started yet.

Later, I will work on the code and decide how to handle the data when the high is chosen to be created before the low (and the opposite), or, the hardest case, when this is not done deterministically and some distribution decides which goes first.

Hopefully this describes the task (question) exactly. I would be glad for every tip or idea. Thanks for the help.
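One possible way to picture the non-deterministic case, as a rough sketch only: assume the order is drawn independently for each bar, with the high coming first with some fixed probability. The helper name `expand_bars` and the parameter `p_high_first` are hypothetical, and the independent per-bar draw is an assumption, not something the task fixes. The two sample rows are taken from the table in the question.

```r
# Hypothetical helper: expand OHLC bars into one price per row.
# Assumption: for each bar, the high is emitted before the low with
# probability p_high_first, independently of the other bars.
expand_bars <- function(bars, p_high_first = 0.5) {
  n <- nrow(bars)
  # TRUE means the high comes before the low for that bar
  high_first <- runif(n) < p_high_first
  second <- ifelse(high_first, bars$high, bars$low)
  third  <- ifelse(high_first, bars$low, bars$high)
  # One row per bar, columns in emission order; t() interleaves bar by bar
  m <- cbind(bars$open, second, third, bars$close)
  data.frame(
    date  = rep(bars$date, each = 4),
    time  = rep(bars$time, each = 4),
    price = as.vector(t(m)),
    stringsAsFactors = FALSE
  )
}

# First two bars of the sample data
bars <- data.frame(
  date  = c("01/28/2002", "01/28/2002"),
  time  = c("0833", "0850"),
  open  = c(543.81, 542.95),
  high  = c(543.82, 542.95),
  low   = c(543.84, 542.95),
  close = c(543.85, 542.95),
  stringsAsFactors = FALSE
)

set.seed(1)  # make the random order reproducible
ticks <- expand_bars(bars, p_high_first = 0.7)
```

Setting `p_high_first` to 1 or 0 gives the two deterministic variants (high always first, low always first), so the same function would cover all three cases.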

Try

    library(tidyr)
    library(dplyr)

    # Stack the four price columns into one, then restore the bar order
    df1 <- df %>%
      gather(var, price, open:close) %>%
      arrange(date, time) %>%
      select(-var)

    head(df1)
    #        date time  price
    #1 01/28/2002 0833 543.81
    #2 01/28/2002 0833 543.82
    #3 01/28/2002 0833 543.84
    #4 01/28/2002 0833 543.85
    #5 01/28/2002 0850 542.95
    #6 01/28/2002 0850 542.95

Data

    df <- structure(list(
      date  = c("01/28/2002", "01/28/2002", "01/28/2002", "01/28/2002"),
      time  = c("0833", "0850", "0901", "0911"),
      open  = c(543.81, 542.95, 542.45, 542.45),
      high  = c(543.82, 542.95, 542.45, 542.45),
      low   = c(543.84, 542.95, 542.45, 542.45),
      close = c(543.85, 542.95, 542.45, 542.45)),
      .Names = c("date", "time", "open", "high", "low", "close"),
      row.names = c(NA, -4L), class = "data.frame")
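A possible variant, offered as a sketch and not as part of the original answer: `tidyr::pivot_longer()` (available in tidyr 1.0.0 and later) emits the stacked values row by row, so the four prices of each bar stay together without a sorting step. That may matter here because `date` is a character column in mm/dd/yyyy form, which does not sort chronologically across years. The object names `long` and `df2` are illustrative only.

```r
library(tidyr)

# Sample rows from the question
df <- data.frame(
  date  = rep("01/28/2002", 4),
  time  = c("0833", "0850", "0901", "0911"),
  open  = c(543.81, 542.95, 542.45, 542.45),
  high  = c(543.82, 542.95, 542.45, 542.45),
  low   = c(543.84, 542.95, 542.45, 542.45),
  close = c(543.85, 542.95, 542.45, 542.45),
  stringsAsFactors = FALSE
)

# Each input row becomes four consecutive output rows (open, high, low, close)
long <- pivot_longer(df, cols = open:close,
                     names_to = "var", values_to = "price")

# Keep only the requested columns and return a plain data.frame
df2 <- as.data.frame(long[c("date", "time", "price")])
head(df2)
```

Because the original row order is preserved, this relies on the input file already being in chronological order.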

r stockquotes
