Pandas Fillna Examples: Filling in Missing Data

Last updated: 29 May 2023

Table of Contents

Fill with value
Specific column
Specific row
Multiple columns
Fill inplace
Fill with another column
Fill with another dataframe
Fill values matching condtion
Null-like values
Fill other values

Fill with value

This is the simplest way to use fillna, just call df.fillna(<value-to-fill>)

Example: Fill with 0:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   2.0,  3.0,   np.nan, None ],
    'col2': [1,     2,    3,     4,      5    ],
    'col3': ['foo', None, 'baz', 'quux', 'bax']
})

df.fillna(0)

BEFORE: dataframe has null-ish values
like np.nan and None

AFTER: null-ish values replaced by zero but note
that column types are unchanged

Specific column

Pass a dict of the form {col_name: value_to_fill} to fillna():

Example: Fill nulls with "-" in col1 only:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   2.0,  3.0,   np.nan, None ],
    'col2': [1,     2,    3,     4,      5    ],
    'col3': ['foo', None, 'baz', 'quux', 'bax']
})

df.fillna({"col1": "-"})

BEFORE: dataframe has null-ish values
like np.nan and None

AFTER: only col1 has had its null-like
values replaced by "-". Other columns are unchanged

Specific row

Use df.loc[<row_index>, df.columns] and map instead:

Example: Fill all null-like cells in row 1 with "--"

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   np.nan, 3.0,   np.nan, None ],
    'col2': [1,     2,      3,     4,      5    ],
    'col3': ['foo', None,  'baz', 'quux', 'bax']
})

index_to_fill = 1
value_to_fill = "--"

df.loc[index_to_fill,df.columns] = df.loc[index_to_fill,:].map(lambda value: value_to_fill if pd.isna(value) else value)
df

BEFORE: original dataframe with several null-like values

AFTER: only values in the row indexed by 1
were filled with "--"
(other rows left unchanged)

Multiple columns

To fill nulls in multiple specific columns, pass a dict to fillna

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   2.0,  3.0,   np.nan, None ],
    'col2': [1,     2,    3,     4,      5    ],
    'col3': ['foo', None, 'baz', 'quux', 'bax'],
    'col4': ['xxx', None, None,  None,   'xxx'],
})

df.fillna({"col1": 0, "col3": "--"})

BEFORE: several columns have null-like values

after-fillna-applied-to-some-columns-only

AFTER: only col1 and col3 have had
their nulls filled with 0 and "--",
respectively

Fill inplace

Just pass inplace=True to make fillna() return None instead of the modified dataframe:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   2.0,  3.0,   np.nan, None ],
    'col2': [1,     2,    3,     4,      5    ],
    'col3': ['foo', None, 'baz', 'quux', 'bax']
})


# returns nothing
df.fillna(0, inplace=True)

Fill with another column

This is sometimes called coalesce in other tools such as Apache Spark.

Use a dict and pass the column you want as the values:

Example: Fill nulls in col1 using values in col2

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   2.0,  3.0,   np.nan, None ],
    'col2': [1,     2,    3,     4,      5    ],
    'col3': ['foo', None, 'baz', 'quux', 'bax'],
})

df.fillna({'col1': df['col2']})

BEFORE: original dataframe with some null values

AFTER: every missing value in col1 was filled with the
equivalent value in col2
This is sometimes referred to as coalescing in
tools such as Apache Spark

Fill with another dataframe

If you have another dataframe to serve as the replacement values, you can use that by calling df.fillna(another_df):

Example: Fill null-like values in df with the matching values in replacement_df

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1.0,   2.0,  3.0,   np.nan, None ],
    'col2': ['foo', None, 'baz', 'quux', 'bax'],
})

replacement_df = pd.DataFrame({
    'col1': [0.0,  0.0,  0.0,  0.0,  0.0 ],
    'col2': ['--', '--', '--', '--', '--'],
})

df.fillna(replacement_df)

df is the base dataframe

replacement_df is the dataframe with the values to be used
(note that the column names and the index
match the ones in the base dataframe!)

AFTER: null-like values in df have been replaced with the
matching values in replacement_df

Fill values matching condtion

By default, fillna will fill null-like values, which inlude np.nan, None, etc.

To fill or replace other values, use df.where(<func>, <value_to_replace>) instead:

Example: replace all odd numbers with a string "ODD":

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'col1': [1,  2,  3,  4],
    'col2': [11, 12, 13, 14],
})

# note that the function must evaluate to FALSE
# to trigger the replacement
df.where(lambda value: value %2 == 0, "ODD")

BEFORE: even and odd numbers

AFTER: use where to pass a function that returns a boolean
value signalling which values should be replaced
by the provided value

Null-like values

These are all values the fillna function considers to be "null-like":

Value
`pandas.NaT`
`pandas.NA`
`numpy.nan`
`None`

Fill other values

One way to do this is to use df.replace to turn the values you wish to consider as null-like into np.nan or something like that and then use fillna:

Example: consider the string "-" as null-like and use fillna

import pandas as pd
import numpy as np


df = pd.DataFrame({
    'col1': [np.nan, "-",   3.0,  np.nan, None ],
    'col2': [np.nan, "foo", 'baz', pd.NaT, "-"],
})

df = df.replace({"-":np.nan})

df.fillna("XXX")

BEFORE: in addition to normal null-like values
like np.nan and None, we would
also like to consider "-" as a null-like value

AFTER: both the normal null-like values and the
cells that had "-" had their values
replaced with "XXX"

References

pandas documentation: DataFrame.fillna

Felipe 16 Oct 2022 29 May 2023 pandas