How to use pandas groupby multiple columns?
I have a DataFrame containing duplicate rows that I'd like to combine based on one column (name). For about half of the other columns I'd like to keep a single value (they should all be the same within a group), and for the rest I'd like to sum the values.
I've tried the following code based on an answer I found here: Pandas merge column duplicate and sum value
df2 = df.groupby(['name']).agg({'address': 'first', 'cost': 'sum'})
The only issue is I have 100 columns, so would rather not list them all out. Is there a way to pass a tuple or list in the place of 'address' and 'cost' above? Something along the lines of
column_list = df.columns.values.tolist()
columns_first = tuple(column_list[0:68])
columns_sum = tuple(column_list[68:104])
Rather than writing the dictionary out by hand, you could perhaps generate it with a dict comprehension over the column names. E.g.
df2 = df.groupby(['name']).agg({col: 'first' if i
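The snippet above is cut off, so here is a minimal sketch of the same idea rather than the original answer's exact code. It assumes the columns to keep come first and the columns to sum come after them; the toy column names and the `split` boundary are made up for illustration, and the grouping key is left out of the mapping.

```python
import pandas as pd

# Toy frame standing in for the 100-column one; column names are illustrative.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "address": ["x", "x", "y"],
    "city": ["p", "p", "q"],
    "cost": [1, 2, 3],
    "tax": [10, 20, 30],
})

# Every column except the grouping key, in their original order.
value_cols = [c for c in df.columns if c != "name"]

# Hypothetical boundary: columns before it keep 'first', the rest are summed.
split = 2

# Build the {column: aggregation} mapping instead of listing it by hand.
agg_map = {col: ("first" if i < split else "sum")
           for i, col in enumerate(value_cols)}

df2 = df.groupby("name", as_index=False).agg(agg_map)
print(df2)
```

For the real frame, `split` would be wherever the "keep one value" columns end once the grouping column is excluded. If the two kinds of column are interleaved, building the mapping from two explicit lists of names would work the same way.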