Breedlove: python - expand each row to multiple rows in pandas using DataFrame.apply (similar to MapReduce)

Friday 15 March 2013


Here's a simplified version of my problem. I have a dataframe that has the start and end locations of trips. I want to end up with a dataframe that has, for each station, the number of arrivals and departures.

I am familiar with MapReduce-like workflows, where in the map phase you can take in one row and output multiple rows, and then aggregate over those rows in the reduce phase.

Here's the code I have now, which does not work.

    import pandas as pd
    import numpy as np

    def expand_row(row):
        # Intended "map" step: emit a departure and an arrival for each trip.
        return pd.Series(
            {
                'station': [row['start_station'], row['end_station']],
                'departures': [1, 0],
                'arrivals': [0, 1],
            },
        )

    trips = pd.DataFrame({
        'start_station': ['a', 'c'],
        'end_station': ['b', 'a'],
    })

    expanded = trips.apply(expand_row, axis=1)
    # Intended "reduce" step: sum the counts per station.
    aggregated = expanded.groupby('station').aggregate(np.sum)

What I want as the final dataframe is:

    desired_df = pd.DataFrame({
        'station': ['a', 'b', 'c'],
        'departures': [1, 0, 1],
        'arrivals': [1, 1, 0]
    })
    desired_df.index = desired_df.pop('station')

Many thanks.
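For reference, the map-then-reduce workflow described in the question can be sketched directly. With axis=1, apply produces one output row per input row, so the lists returned by expand_row stay packed inside single cells instead of becoming separate rows. The sketch below instead builds the expanded records in a plain loop and then groups them; the names records, expanded and aggregated are illustrative choices, and this is one possible approach rather than the only one.

```python
import pandas as pd

trips = pd.DataFrame({
    "start_station": ["a", "c"],
    "end_station": ["b", "a"],
})

# "Map" step: emit two records per trip, one departure and one arrival.
records = []
for row in trips.itertuples(index=False):
    records.append({"station": row.start_station, "departures": 1, "arrivals": 0})
    records.append({"station": row.end_station, "departures": 0, "arrivals": 1})
expanded = pd.DataFrame(records)

# "Reduce" step: sum the per-record counts for each station.
aggregated = expanded.groupby("station")[["departures", "arrivals"]].sum()
print(aggregated)
```

The Python-level loop is easy to follow but slow on large frames, so it is best read as an illustration of the two phases rather than a tuned solution.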

    import pandas as pd

    trips = pd.DataFrame({
        'start_station': ['a', 'c'],
        'end_station': ['b', 'a'],
    })
    # Count how often each station appears in each column; missing stations become 0.
    trips.apply(pd.value_counts).fillna(0)

The result is:

       end_station  start_station
    a            1              1
    b            1              0
    c            0              1
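To get from this result to the layout of desired_df in the question, the columns can be renamed and the index labelled. The following is a minimal sketch that assumes a recent pandas version, where the top-level pd.value_counts is deprecated and pd.Series.value_counts can be passed to apply instead; the variable name counts is only illustrative.

```python
import pandas as pd

trips = pd.DataFrame({
    "start_station": ["a", "c"],
    "end_station": ["b", "a"],
})

# Count each station per column; stations absent from a column become 0.
counts = trips.apply(pd.Series.value_counts).fillna(0).astype(int)

# Rename to the column names used in desired_df, fix the column order,
# and label the index as "station".
counts = counts.rename(
    columns={"start_station": "departures", "end_station": "arrivals"}
)
counts = counts[["departures", "arrivals"]].sort_index()
counts.index.name = "station"
print(counts)
```

The astype(int) call is there because fillna(0) leaves the counts as floats after the missing values are filled.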

python pandas
