Joining Data on Pandas: Dataframe Merge Examples

Last updated:

WIP Alert This is a work in progress. Current information is correct but more content may be added in the future.

Note that using pd.merge(left_df,right_df) can also be written as left_df.merge(right_df)

Simple join using pd.merge()

The default is an inner join. Use 'on'='left'|'right'|'outer' to change join types.

pd.merge(
    <left_daframe>, 
    <right_dataframe>, 
    left_on= <column_on_left_dataframe>, 
    right_on= <column_on_right_dataframe>)

Join on indices

If the columns you want to join on are Indices, use left_index and right_index:

pd.merge(
    <left_daframe>, 
    <right_dataframe>, 
    how= <'inner','left','right'>, 
    left_index=True, 
    right_index=True)

Join on multiple columns

TODO

pd.merge vs dataframe.join

TODO

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.join.html

Approximate match join

I.e. join a row on left_df with a row on right_df where column column_right on right_df is the closest to column_left on left_df, according to a given distance function.

TODO

https://stackoverflow.com/questions/13636848/is-it-possible-to-do-fuzzy-match-merge-with-python-pandas

https://stackoverflow.com/questions/34880539/pandas-merging-based-on-a-timestamp-which-do-not-match-exactly

  • This looks very promising

    https://pandas.pydata.org/pandas-docs/stable/generated/pandas.merge_asof.html

  • Maybe I could use this to snap the dates to the nearest month and then do an exact match:

    https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DatetimeIndex.snap.html

  • merge_closest

    https://github.com/RobinFiveWords/merge_closest

  • np.piecewise

    https://docs.scipy.org/doc/numpy/reference/generated/numpy.piecewise.html

  • IntervalIndex

    https://stackoverflow.com/a/46526249/436721

Merge on closest date

TODO a particular case of the above?

Merge with date range

TODO a particular case of the above?

Merge by year/month/day of the week

TODO a particular case of the above?

References

Dialogue & Discussion