Calculating the mean for class variables in a Python dataframe
I have a dataframe of session log-in data. Each entry is associated with a class (e, c, g, m). The rows look like this:
1: [session_start_time, session_end_time, class_id, problems_completed, student_id, student_account_created, student_previous_logins_total, student_previous_class_logins, duration]
2: [1/6/12 16:28, 1/6/12 16:55, e, 37, 91, 10/26/11 0:00, 76, 27, 1/1/04 0:27]
3: [1/11/12 13:18, 1/11/12 13:58, m, 33, 172, 1/10/12 0:00, 5, 3, 1/1/04 0:40]
I am trying to calculate the average "duration" for each class (e, c, g, etc.). I am having a problem finding the right command to calculate the average per class, rather than the mean of the whole column.
I'm not sure what data format/structure your source data is in, since what is presented is not an exact Python representation. Let's assume the rows are lists of strings (or can be converted to them):
rows = [
    ['1/6/12 16:28', '1/6/12 16:55', 'e'],
    ['1/11/12 13:18', '1/11/12 13:58', 'm'],
    ['1/13/12 13:20', '1/13/12 13:24', 'm'],
]
Then, here's one way to compute the mean for each class:
from collections import Counter
from datetime import datetime

def parse(s, format="%x %H:%M"):
    """
    Return s parsed as a datetime in the given format.
    %x is the locale's date format (mm/dd/yy in the default C locale).
    """
    return datetime.strptime(s, format)

total_items = Counter()     # number of sessions per kind
total_duration = Counter()  # summed duration in seconds per kind

for start, end, kind in rows:
    duration = parse(end) - parse(start)
    total_items[kind] += 1
    total_duration[kind] += duration.total_seconds()

# Mean duration in seconds for each kind
means = {k: total_duration[k] / total_items[k] for k in total_items}
print(means)
This uses collections.Counter objects to track both the count of each class in the log and the total duration. The duration must be computed by first parsing the date/time string representation into the internal datetime.datetime format. Once the counters are accumulated, a dictionary comprehension computes the mean per kind (what you call "class"; since that's a technical Python construct, I call it kind).
The resulting means dictionary stores the computed values, in seconds. means['m'] gives the mean of the 'm' entries, and so forth.
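Because the means come out as plain seconds, it can help to turn them back into something readable. The following is a minimal sketch, not part of the original answer: it hard-codes the values the code above should yield for the three sample rows (27 minutes for 'e'; 40 and 4 minutes averaged for 'm'), and the name readable is just an illustrative choice.

```python
from datetime import timedelta

# Means in seconds, as computed from the three sample rows:
#   'e': one session of 27 minutes           -> 1620 s
#   'm': sessions of 40 and 4 minutes (mean) -> 1320 s
means = {'e': 1620.0, 'm': 1320.0}

# Convert each mean to a timedelta and format it as H:MM:SS.
# "readable" is a hypothetical name used only for this illustration.
readable = {kind: str(timedelta(seconds=secs)) for kind, secs in means.items()}

print(readable['e'])  # 0:27:00
print(readable['m'])  # 0:22:00
```

Dividing by 60 instead would give the mean in minutes, if a number is more convenient than a formatted string.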
While the parse function will work for the few data samples you showed in your question, date/time parsing is pretty finicky. Instead of using the strptime method here, I recommend using a more expansive and inclusive parser, such as the one found in the dateutil module. If you want to use that, delete or rename the parse function found here, and substitute:
from dateutil.parser import parse
That provides a drop-in replacement that accepts a much broader range of formats.
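To see why strptime is finicky, and what a standard-library-only workaround might look like if installing dateutil is not an option, here is a rough sketch. The function name parse_any and the list CANDIDATE_FORMATS are assumptions made up for this example; the second format is only a guess at what other log data might contain.

```python
from datetime import datetime

# Hypothetical list of formats to try, in order; extend it to match your data.
CANDIDATE_FORMATS = ["%m/%d/%y %H:%M", "%Y-%m-%d %H:%M:%S"]

def parse_any(s, formats=CANDIDATE_FORMATS):
    """Return s parsed with the first format that matches, else raise ValueError."""
    for fmt in formats:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            # strptime rejects anything that deviates from the format; try the next one
            continue
    raise ValueError("no known format matches %r" % s)

print(parse_any("1/6/12 16:28"))         # 2012-01-06 16:28:00
print(parse_any("2012-01-06 16:28:00"))  # 2012-01-06 16:28:00
```

A single format string handles only one layout, so each new layout needs another entry in the list. dateutil's parser saves you from maintaining that list by hand.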