Pandas Time Series Examples: DatetimeIndex, PeriodIndex and TimedeltaIndex


WIP Alert: This is a work in progress. Current information is correct but more content may be added in the future.

View all code in this jupyter notebook

For more examples on how to manipulate date and time values in pandas dataframes, see Pandas Dataframe Examples: Manipulating Date and Time

Use existing date column as index

If your dataframe already has a date column, you can use it as the index, which will then be of type DatetimeIndex:

import pandas as pd

# this is the original dataframe
df = pd.DataFrame({
    'name':[
        'john','mary','peter','jeff','bill'
    ],
    'date_of_birth':[
        '2000-01-01', '1999-12-20', '2000-11-01', '1995-02-25', '1992-06-30',
    ],
})

print(df.index)
# RangeIndex(start=0, stop=5, step=1)

# convert the column (it's a string) to datetime type
datetime_series = pd.to_datetime(df['date_of_birth'])

# create datetime index passing the datetime series
datetime_index = pd.DatetimeIndex(datetime_series.values)

df2=df.set_index(datetime_index)

# we don't need the column anymore
df2.drop('date_of_birth',axis=1,inplace=True)

print(df2.index)
# DatetimeIndex(['2000-01-01', '1999-12-20', '2000-11-01', '1995-02-25',
#    '1992-06-30'], dtype='datetime64[ns]', freq=None)

BEFORE: If you don't specify an index when creating a dataframe, it gets a RangeIndex by default.

AFTER: After setting the index to the date column, the index is of type DatetimeIndex.
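The same result can be reached in fewer steps. The sketch below is an alternative to the code above, not the original approach: it parses the column and passes the column name to set_index(), which removes the column from the dataframe by default. One difference to be aware of is that the resulting index keeps the column name (date_of_birth), whereas the index built above is unnamed.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['john', 'mary', 'peter', 'jeff', 'bill'],
    'date_of_birth': [
        '2000-01-01', '1999-12-20', '2000-11-01', '1995-02-25', '1992-06-30',
    ],
})

# parse the string column into datetimes, then promote it to the index;
# set_index() drops the column from the frame by default (drop=True)
df2 = (
    df.assign(date_of_birth=pd.to_datetime(df['date_of_birth']))
      .set_index('date_of_birth')
)

print(type(df2.index).__name__)
# DatetimeIndex

print(list(df2.columns))
# ['name']
```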

Add row for empty periods

View all offset aliases here

import pandas as pd

df = pd.DataFrame({
    'name':[
        'john','mary','peter','jeff','bill'
    ],
    'year_born':[
        '2000', '1999', '2001', '1995', '1992',
    ],
})

df.index
# RangeIndex(start=0, stop=5, step=1)

# build a datetime index from the year column (the years are strings,
# so we state the format explicitly)
datetime_series = pd.to_datetime(df['year_born'], format='%Y')
datetime_index = pd.DatetimeIndex(datetime_series.values)

# replace the original index with the new one
df3=df.set_index(datetime_index)

# we don't need the column anymore
df3.drop('year_born',axis=1,inplace=True)

# IMPORTANT! we can only add rows for missing periods
# if the dataframe is SORTED by the index
df3.sort_index(inplace=True)

df3.index
# DatetimeIndex(['1992-01-01', '1995-01-01', '1999-01-01', '2000-01-01',
#               '2001-01-01'],
#              dtype='datetime64[ns]', freq=None)

# 'YS' stands for 'YEAR START'
df4=df3.asfreq('YS')

df4.index
# DatetimeIndex(['1992-01-01', '1993-01-01', '1994-01-01', '1995-01-01',
#               '1996-01-01', '1997-01-01', '1998-01-01', '1999-01-01',
#               '2000-01-01', '2001-01-01'],
#              dtype='datetime64[ns]', freq='AS-JAN')
# (newer pandas versions may print freq='YS-JAN' instead)

df: original dataframe.

df3: after transforming the date column into a DatetimeIndex. Note that the years have been converted to day-based dates.

df4: after calling asfreq(), extra rows (shown in blue in the figure) have been added for the missing periods.
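To see which rows asfreq() added, you can look for missing values: the new rows carry no data, so their columns are NaN. The sketch below rebuilds a small dataframe with one birth year per person (the years are illustrative) and counts the added rows. The '(nobody)' placeholder passed to fill_value is just an example value for the new rows.

```python
import pandas as pd

# one row per person, indexed by year of birth (as a year-start date)
df = pd.DataFrame(
    {'name': ['john', 'mary', 'peter', 'jeff', 'bill']},
    index=pd.to_datetime(['2000', '1999', '2001', '1995', '1992'], format='%Y'),
).sort_index()

# 'YS' = year start: one row per year between the first and last index value
filled = df.asfreq('YS')

# rows created by asfreq() have no data, so 'name' is NaN there
added = filled[filled['name'].isna()]

print(len(df), len(filled), len(added))
# 5 10 5

# optionally give the new rows a placeholder value instead of NaN
filled_placeholder = df.asfreq('YS', fill_value='(nobody)')
```

Note that fill_value only applies to the rows created by asfreq(); NaN values already present in the original data are left untouched.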
