Map a Ruby array of timeseries values to a weighted interval
I have a Ruby array of arrays that represents a series of observations of a metric recorded over time. Each inner array has two elements: a Time instance in UTC describing when the observation was recorded, and the integer value of the observation. For example, I might have something like:
[
  [<time: 2014-01-15 @ 18:00>, 100],
  [<time: 2014-01-16 @ 06:00>, 200],
  [<time: 2014-01-16 @ 12:00>, 300],
  [<time: 2014-01-16 @ 23:00>, 400],
  [<time: 2014-01-17 @ 12:00>, 500],
  [<time: 2014-01-18 @ 03:00>, 600],
  [<time: 2014-01-18 @ 06:00>, 700],
]
The problem at hand is to turn this into an array of weighted values for each date:
[ [<date: 2014-01-15>, 100], [<date: 2014-01-16>, 229], ... ]
The value for each day in the above array is obtained by the following procedure:
1. Break the day into a series of intervals delimited by each observation and the boundaries of the day. For example, since Jan 16th has observations at 06:00, 12:00, and 23:00, it is broken into the intervals 00:00-06:00, 06:00-12:00, 12:00-23:00, and 23:00-00:00.
2. The value of each interval is equal to the value of the observation at the origin of the interval, or the last observation made if the interval begins at the start of the day. For example, the value of the 06:00-12:00 interval on Jan 16th is 200, since a value of 200 was recorded at 06:00. The value of the 00:00-06:00 interval on Jan 16th is 100, since 100 was the last observation recorded at the point the day started.
3. The weighted value of each interval is equal to its value multiplied by the fraction of the day that the interval occupies. For example, the weighted value of the 06:00-12:00 interval on Jan 16th is 50 (200 * 0.25).
4. The final weighted value of each day is the sum of the weighted values of its intervals, coerced to an integer.
For example, the weighted value for Jan 16th is 229, because:
(100*(6/24.0) + 200*(6/24.0) + 300*(11/24.0) + 400*(1/24.0)).to_i = 229
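To make the arithmetic for a single day concrete, here is a minimal sketch of the Jan 16th calculation. It is only an illustration: the `points` array (hours since midnight paired with the value in effect from that hour) and the closing `[24, nil]` sentinel are assumptions introduced for this example, not part of the data format described above.

```ruby
# Illustrative only: each pair is [hour of day, value in effect from that hour].
# The 00:00 entry carries the last value observed on the previous day (100).
points = [[0, 100], [6, 200], [12, 300], [23, 400]]

# Append an end-of-day sentinel so every interval has a closing boundary.
boundaries = points + [[24, nil]]

# Weight each interval's value by the fraction of the day it occupies.
weighted = boundaries.each_cons(2).sum do |(start, value), (finish, _)|
  value * (finish - start) / 24.0
end

puts weighted.to_i   # prints 229
```

Dividing by `24.0` rather than `24` matters here, since integer division would truncate every fraction to zero.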
The first point in the array is a special case: the day starts there rather than at 00:00, so Jan 15th has one interval, 18:00-00:00, with a value of 100 and therefore a weighted value of 100.
Any suggestions on how to get started tackling this?
I've assumed there are no days with no entries.
I found it convenient to first transform the array of Time objects. The rules I used for the transformation are as follows (arb refers to an arbitrary value, which may equal val):
For the first day, replace the element [dt, val] with three elements:
  [dt1, val], where dt1 is the same date at time 00:00:00
  [dt2, arb], where dt2 is the same date at time 23:59:59
  [dt3, val], where dt3 is one day later at time 00:00:00
For the last day, if [dt, val] is the last element for that day, add the element [dt1, arb], where dt1 is the same date at time 23:59:59.
For every day other than the first and last, if [dt, val] is the last element for that day, add two elements:
  [dt1, arb], where dt1 is the same date at time 23:59:59
  [dt2, val], where dt2 is one day later at time 00:00:00
Suppose we have the following initial array. For clarity, I've used strings (allowing me to replace "23:59:59" with "24:00"):
arr = [
  ["2014-01-15 18:00", 100],
  ["2014-01-16 06:00", 200],
  ["2014-01-16 12:00", 300],
  ["2014-01-16 23:00", 400],
  ["2014-01-17 12:00", 500],
  ["2014-01-18 03:00", 600],
  ["2014-01-18 06:00", 700]
]
After applying the above rules, we obtain:
arr1 = [ ["2014-01-15 00:00", 100], ["2014-01-15 24:00", 100], ["2014-01-16 00:00", 100], ["2014-01-16 06:00", 200], ["2014-01-16 12:00", 300], ["2014-01-16 23:00", 400], ["2014-01-16 24:00", 400], ["2014-01-17 00:00", 400], ["2014-01-17 12:00", 500], ["2014-01-17 24:00", 500], ["2014-01-18 00:00", 500], ["2014-01-18 03:00", 600], ["2014-01-18 06:00", 700], ["2014-01-18 24:00", 700] ]
or, with the elements grouped by date,
arr1 = [
  ["2014-01-15 00:00", 100], ["2014-01-15 24:00", 100],

  ["2014-01-16 00:00", 100], ["2014-01-16 06:00", 200], ["2014-01-16 12:00", 300],
  ["2014-01-16 23:00", 400], ["2014-01-16 24:00", 400],

  ["2014-01-17 00:00", 400], ["2014-01-17 12:00", 500], ["2014-01-17 24:00", 500],

  ["2014-01-18 00:00", 500], ["2014-01-18 03:00", 600], ["2014-01-18 06:00", 700],
  ["2014-01-18 24:00", 700]
]
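The transformation rules can be sketched roughly as follows for the string form of the data. This is one possible reading of the rules, not code from the original answer: `pad_days` is a hypothetical helper name, arb is simply set to the day's last value, and the sketch assumes the input is sorted, has no empty days, and has a single observation on the first day (as in the example).

```ruby
require 'date'

# Hypothetical helper: pads "YYYY-MM-DD HH:MM" observations with
# start-of-day and end-of-day entries according to the rules above.
def pad_days(arr)
  by_date    = arr.group_by { |stamp, _| stamp[0, 10] }  # keyed by date, in input order
  last_index = by_date.size - 1

  by_date.each_with_index.flat_map do |(date, obs), i|
    rows = obs.map(&:dup)
    # First day: move the observation back to 00:00 (assumes one observation that day).
    rows[0] = ["#{date} 00:00", rows[0][1]] if i.zero?
    last_val = rows.last[1]
    # Close the day at "24:00"; the value here is the arbitrary one (arb).
    rows << ["#{date} 24:00", last_val]
    # Carry the last value over to 00:00 of the next day, except after the last day.
    rows << ["#{Date.parse(date) + 1} 00:00", last_val] unless i == last_index
    rows
  end
end

arr = [
  ["2014-01-15 18:00", 100], ["2014-01-16 06:00", 200],
  ["2014-01-16 12:00", 300], ["2014-01-16 23:00", 400],
  ["2014-01-17 12:00", 500], ["2014-01-18 03:00", 600],
  ["2014-01-18 06:00", 700]
]
arr1 = pad_days(arr)
```

For the sample input this yields the same fourteen elements as the arr1 listed above.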
The code to implement these rules should be straightforward. Once you have arr1, create an enumerator with Enumerable#chunk:
enum = arr1.chunk { |a| a.first[0,10] }
  #=> #<Enumerator: #<Enumerator::Generator:0x000001010e30d8>:each>
Let's see the elements of enum:
enum.to_a
  #=> [["2014-01-15", [["2014-01-15 00:00", 100], ["2014-01-15 24:00", 100]]],
  #    ["2014-01-16", [["2014-01-16 00:00", 100], ["2014-01-16 06:00", 200],
  #                    ["2014-01-16 12:00", 300], ["2014-01-16 23:00", 400],
  #                    ["2014-01-16 24:00", 400]]],
  #    ["2014-01-17", [["2014-01-17 00:00", 400], ["2014-01-17 12:00", 500],
  #                    ["2014-01-17 24:00", 500]]],
  #    ["2014-01-18", [["2014-01-18 00:00", 500], ["2014-01-18 03:00", 600],
  #                    ["2014-01-18 06:00", 700], ["2014-01-18 24:00", 700]]]]
Now we need to map each element (one per date) to the weighted average of the vals (noting that we don't use the first element of each element of enum):
enum.map { |_,arr| (arr.each_cons(2)
                       .reduce(0.0) { |t,((d1,v1),(d2,_))|
                          t + min_diff(d2,d1)*v1 }/1440.0).round(2) }
  #=> [100.0, 229.17, 450.0, 662.5]
using the helper:
def min_diff(str1, str2)
  60*(str1[-5,2].to_i - str2[-5,2].to_i) + str1[-2,2].to_i - str2[-2,2].to_i
end
Putting it all together:
arr1.chunk { |a| a.first[0,10] }
    .map { |_,arr| (arr.each_cons(2)
                       .reduce(0.0) { |t,((d1,v1),(d2,_))|
                          t + min_diff(d2,d1)*v1 }/1440.0).round(2) }
  #=> [100.0, 229.17, 450.0, 662.5]
along with the helper min_diff.
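Since the question's data holds Time objects rather than strings, the same idea can also be sketched without the string padding. The following is a rough variant, not the method above: `daily_weighted` is a hypothetical name, the input is assumed to be sorted, in UTC, and free of empty days, and the result is truncated with to_i as the question asks (instead of round(2)).

```ruby
require 'date'   # also provides Time#to_date

# Hypothetical helper: returns [[Date, Integer], ...] of time-weighted daily values.
def daily_weighted(series)
  carried = nil  # last value seen on the previous day
  series.group_by { |t, _| t.to_date }.map do |date, obs|
    midnight = Time.utc(date.year, date.month, date.day)
    day_end  = midnight + 86_400
    # Start at 00:00 with the carried value; on the first day, start at the first observation.
    points = carried ? [[midnight, carried]] + obs : obs
    points += [[day_end, nil]]
    span   = day_end - points.first[0]            # seconds covered by this day
    total  = points.each_cons(2).sum { |(t1, v), (t2, _)| v * (t2 - t1) }
    carried = obs.last[1]
    [date, (total / span).to_i]
  end
end

series = [
  [Time.utc(2014, 1, 15, 18), 100], [Time.utc(2014, 1, 16, 6), 200],
  [Time.utc(2014, 1, 16, 12), 300], [Time.utc(2014, 1, 16, 23), 400],
  [Time.utc(2014, 1, 17, 12), 500], [Time.utc(2014, 1, 18, 3), 600],
  [Time.utc(2014, 1, 18, 6), 700]
]
result = daily_weighted(series)
result.each { |date, value| puts "#{date}: #{value}" }
```

Dividing by the covered span rather than a fixed 86,400 seconds is what handles the first-day special case, where the day effectively starts at the first observation.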
ruby arrays time