Pandas DataFrame by Example: GroupBy Examples


WIP Alert: This is a work in progress. Current information is correct, but more content may be added in the future.

For DataFrame usage examples not related to GroupBy, see Pandas DataFrame by Example.

Concatenate strings in a group

This operation is called GROUP_CONCAT in databases such as MySQL.

In the original dataframe, each row is a tag assignment:

user_id content_id tag
1 1 'cool'
2 1 'nice'
1 2 'clever'
3 2 'clever'
3 2 'not bad'

tag_assignments_df.groupby("content_id")['tag'].apply(lambda tags: ','.join(tags))

After the operation, we have one row per content_id and all tags are joined with ','.

content_id tag
1 'cool,nice'
2 'clever,clever,not bad'
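
The example above can be reproduced end to end with a small script. This is a minimal sketch that rebuilds the sample table by hand; the variable name joined_tags is only illustrative, and the trailing reset_index() is an optional extra step that turns the grouped result back into a regular two-column DataFrame.

```python
import pandas as pd

# Sample data copied from the tag-assignment table above.
tag_assignments_df = pd.DataFrame({
    "user_id": [1, 2, 1, 3, 3],
    "content_id": [1, 1, 2, 2, 2],
    "tag": ["cool", "nice", "clever", "clever", "not bad"],
})

# Join all tags of each content_id into one comma-separated string.
# ",".join receives the Series of tags for each group.
joined_tags = (
    tag_assignments_df.groupby("content_id")["tag"]
    .apply(",".join)
    .reset_index()  # optional: back to a DataFrame with content_id and tag columns
)

print(joined_tags)
```

Each group produces a single plain string, so content_id 1 becomes cool,nice and content_id 2 becomes clever,clever,not bad. Duplicate tags within a group are kept, as in the original example.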

Number of unique column values per group

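How many unique users have tagged each content item? Using the same tag-assignment dataframe as above, group by content_id and count the distinct user_id values:

tag_assignments_df.groupby("content_id")["user_id"].nunique().to_frame().reset_index()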
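
As a runnable sketch of this count, the script below assumes the sample tag-assignment data from the first example, where the item column is named content_id. The output column name unique_users is a hypothetical choice made here for readability; it is passed through the name argument of Series.reset_index, which also makes the to_frame() step unnecessary.

```python
import pandas as pd

# Sample data from the first example (assumed; item column is content_id).
tag_assignments_df = pd.DataFrame({
    "user_id": [1, 2, 1, 3, 3],
    "content_id": [1, 1, 2, 2, 2],
    "tag": ["cool", "nice", "clever", "clever", "not bad"],
})

# Count distinct user_id values within each content_id group.
unique_users = (
    tag_assignments_df.groupby("content_id")["user_id"]
    .nunique()
    .reset_index(name="unique_users")  # hypothetical column name
)

print(unique_users)
```

With this data, both content items end up with 2 unique users: user 3 tagged content 2 twice but is counted once.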
