How to use pandas groupby multiple columns?

Asked by DavidEDWARDS in Data Science, asked on Feb 14, 2023

I have a data frame that contains duplicates, which I'd like to combine based on one column (name). In half of the other columns I'd like to keep one value (they should all be the same), whereas in the remaining columns I'd like to sum the values.


I've tried the following code based on an answer I found here: Pandas merge column duplicate and sum value


df2 = df.groupby(['name']).agg({'address': 'first', 'cost': 'sum'})
The only issue is that I have 100 columns, so I would rather not list them all out. Is there a way to pass a tuple or list in place of 'address' and 'cost' above? Something along the lines of:
column_list = df.columns.values.tolist()
columns_first = tuple(column_list[0:68])
columns_sum = tuple(column_list[68:104])
Answered by David EDWARDS
To apply different aggregations to many columns without listing each one, you could perhaps generate the aggregation dictionary with a dict comprehension, e.g.
df2 = df.groupby(['name']).agg({col: 'first' if i
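The line above is cut off in the source, so the following is only a minimal sketch of how the dict-comprehension idea might be completed, not the answerer's exact code. It assumes the split between "keep first" and "sum" columns is decided by position; the toy data frame, the `n_first` cutoff, and the names `value_cols` and `agg_map` are hypothetical and stand in for the asker's 100 columns.

```python
import pandas as pd

# Hypothetical toy frame standing in for the asker's wide data.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "address": ["x", "x", "y"],
    "city": ["p", "p", "q"],
    "cost": [1, 2, 3],
    "tax": [10, 20, 30],
})

# The grouping key becomes the index, so it is left out of the mapping.
value_cols = [c for c in df.columns if c != "name"]

# Assumption: the first n_first value columns keep one value, the rest are summed.
n_first = 2
agg_map = {
    col: ("first" if i < n_first else "sum")
    for i, col in enumerate(value_cols)
}

# One row per name, with the chosen aggregation applied to each column.
df2 = df.groupby("name").agg(agg_map).reset_index()
print(df2)
```

With the asker's data, `n_first` would be chosen to match wherever the "keep one value" columns end. If the two groups are not contiguous, the same mapping could be built from two explicit lists of column names instead of a positional cutoff.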

