How to replace column values in pandas DataFrame based on column conditions

Renesh Bedre    4 minute read

While working with a pandas DataFrame, you often need to replace or update the values in a column based on a condition on one or more columns. In this article, we will discuss four methods for replacing the values of pandas DataFrame columns based on column conditions.

1. .loc indexing

df.loc[df['col_name'] == 'old_value', 'col_to_replace'] = new_value

2. numpy.where() function

df['col_to_replace'] = np.where(df['col_name'] == 'old_value', 'true_value', 'false_value')

3. pandas mask() function

df['col_to_replace'] = df['col_to_replace'].mask(df['col_to_replace'] == 'old_value', 'new_value')

4. pandas where() function

df['col_to_replace'] = df['col_to_replace'].where(df['col_to_replace'] == 'value_to_keep', 'new_value')

Now, we will discuss these four methods in detail with an example dataset.

1. .loc indexing

import pandas as pd

# create an example DataFrame
df = pd.DataFrame({'name':['Adams', 'Jones', 'Frank', 'Smith', 'Davis'], 
                   'age':[25, 30, 28, 35, 22], 'weight':[74, 90, 85, 65, 92]})
# output
    name  age  weight
0  Adams   25      74
1  Jones   30      90
2  Frank   28      85
3  Smith   35      65
4  Davis   22      92

# replace Smith's weight with 80
df.loc[df['name'] == 'Smith', 'weight'] = 80
# output
    name  age  weight
0  Adams   25      74
1  Jones   30      90
2  Frank   28      85
3  Smith   35      80
4  Davis   22      92

# based on multiple column conditions (starting again from the original DataFrame)
# update Adams's weight to 70 if his age is 25
df.loc[(df['name'] == 'Adams') & (df['age'] == 25), 'weight'] = 70
# output
    name  age  weight
0  Adams   25      70
1  Jones   30      90
2  Frank   28      85
3  Smith   35      65
4  Davis   22      92

# replace the weight value with 80 if age is greater than 28 (again starting from the original DataFrame)
df.loc[df['age'] > 28, 'weight'] = 80
# output
    name  age  weight
0  Adams   25      74
1  Jones   30      80
2  Frank   28      85
3  Smith   35      80
4  Davis   22      92

The pandas .loc indexer is a convenient way to replace column values based on a conditional expression. The condition can involve a single column or several columns combined with & (and) or | (or), with each comparison wrapped in its own parentheses.
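As a further illustration, here is a minimal sketch that combines conditions with | (or) and isin(). The variable name is_target and the chosen names and thresholds are assumptions made for this example only.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Adams', 'Jones', 'Frank', 'Smith', 'Davis'],
                   'age': [25, 30, 28, 35, 22],
                   'weight': [74, 90, 85, 65, 92]})

# build the boolean condition first so it is easy to read and reuse
# "|" means OR; each comparison needs its own parentheses
is_target = (df['age'] < 25) | (df['name'].isin(['Jones', 'Smith']))

# only rows where the condition is True are updated; the others are left alone
df.loc[is_target, 'weight'] = 80
print(df)
```

Storing the condition in a variable is optional, but it keeps the .loc line short when several conditions are combined.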

2. numpy.where() function

import pandas as pd
import numpy as np

# create an example DataFrame
df = pd.DataFrame({'name':['Adams', 'Jones', 'Frank', 'Smith', 'Davis'], 
                   'age':[25, 30, 28, 35, 22], 'weight':[74, 90, 85, 65, 92]})
# output
    name  age  weight
0  Adams   25      74
1  Jones   30      90
2  Frank   28      85
3  Smith   35      65
4  Davis   22      92

# set Jones's age to 25 and every other age to 30
df['age'] = np.where(df['name'] == 'Jones', 25, 30)
# output
    name  age  weight
0  Adams   30      74
1  Jones   25      90
2  Frank   30      85
3  Smith   30      65
4  Davis   30      92

numpy.where() is a conditional function that returns elements chosen from one of two values depending on a condition. Because it produces a value for every row, it is most suitable when you want to update a large number of values in a column based on a condition.

The syntax of this function is:

numpy.where(condition, true_value, false_value)

condition: conditional expression
true_value: The old value will be replaced with this value where the condition is True
false_value: The old value will be replaced with this value where the condition is False
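If only the matching rows should change, one option is to pass the existing column as false_value. The following is a minimal sketch of that idea using the same example data; it is one way to do it, not the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Adams', 'Jones', 'Frank', 'Smith', 'Davis'],
                   'age': [25, 30, 28, 35, 22],
                   'weight': [74, 90, 85, 65, 92]})

# true_value (25) is used for Jones;
# false_value is the column itself, so every other row keeps its current age
df['age'] = np.where(df['name'] == 'Jones', 25, df['age'])
print(df)
```

Here only Jones's age changes, whereas a constant false_value would overwrite every non-matching row.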

3. pandas mask() function

import pandas as pd

# create an example DataFrame
df = pd.DataFrame({'name':['Adams', 'Jones', 'Frank', 'Smith', 'Davis'], 
                   'age':[25, 30, 28, 35, 22], 'weight':[74, 90, 85, 65, 92]})
# output
    name  age  weight
0  Adams   25      74
1  Jones   30      90
2  Frank   28      85
3  Smith   35      65
4  Davis   22      92

# replace the weight value with 98 if it is 90
# mask() returns a new Series, so assign the result back to the column
df['weight'] = df['weight'].mask(df['weight'] == 90, 98)

# output
    name  age  weight
0  Adams   25      74
1  Jones   30      98
2  Frank   28      85
3  Smith   35      65
4  Davis   22      92

The pandas mask() function (available on both Series and DataFrame) is also a conditional function, which replaces the value where the condition is True. Where the condition is False, it keeps the original value. It is the opposite of the pandas where() function.

The syntax of the pandas mask function is:

DataFrame['col_to_replace'].mask(condition, new_value)

condition: conditional expression
col_to_replace: Name of the column in which values need to be replaced
new_value: Old value will be replaced with this value if the condition is True
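The condition does not have to come from the column being replaced. As a minimal sketch, assuming the same example data, the following masks the weight column using a condition on age (the threshold and replacement value are chosen only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Adams', 'Jones', 'Frank', 'Smith', 'Davis'],
                   'age': [25, 30, 28, 35, 22],
                   'weight': [74, 90, 85, 65, 92]})

# mask() returns a new Series: values are replaced where the condition is True
updated_weight = df['weight'].mask(df['age'] > 28, 80)

# assign the result back to store it in the DataFrame
df['weight'] = updated_weight
print(df)
```

Assigning the returned Series back to the column makes the update explicit and does not rely on in-place modification.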

4. pandas where() function

import pandas as pd

# create an example DataFrame
df = pd.DataFrame({'name':['Adams', 'Jones', 'Frank', 'Smith', 'Davis'], 
                   'age':[25, 30, 28, 35, 22], 'weight':[74, 90, 85, 65, 92]})
# output
    name  age  weight
0  Adams   25      74
1  Jones   30      90
2  Frank   28      85
3  Smith   35      65
4  Davis   22      92

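# replace the value if the condition is False
# (keep 'Frank' and replace every other name with 'Jones')
# where() returns a new Series, so assign the result back to the column
df['name'] = df['name'].where(df['name'] == 'Frank', 'Jones')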

# output
    name  age  weight
0  Jones   25      74
1  Jones   30      90
2  Frank   28      85
3  Jones   35      65
4  Jones   22      92

The pandas where() function (available on both Series and DataFrame) is also a conditional function, which replaces the value where the condition is False (the opposite of the mask() function). Where the condition is True, it keeps the original value.

The syntax of pandas where function is:

DataFrame['col_to_replace'].where(condition, new_value)

condition: conditional expression
col_to_replace: Name of the column in which values need to be replaced
new_value: Old value will be replaced with this value if the condition is False. The value remains the same if the condition is True.
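Because where() and mask() are mirror images of each other, inverting the condition with ~ should make them interchangeable. The sketch below checks this on the example data; the variable names keep_frank, with_where, and with_mask are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Adams', 'Jones', 'Frank', 'Smith', 'Davis'],
                   'age': [25, 30, 28, 35, 22],
                   'weight': [74, 90, 85, 65, 92]})

keep_frank = df['name'] == 'Frank'

# where(): keep values where the condition is True, replace the rest
with_where = df['name'].where(keep_frank, 'Jones')

# mask(): replace values where the condition is True, so invert it with ~
with_mask = df['name'].mask(~keep_frank, 'Jones')

# both calls produce the same Series
print(with_where.equals(with_mask))
```

Which of the two to use is mostly a matter of which condition reads more naturally: the rows to keep (where) or the rows to replace (mask).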

If you have any questions, comments or recommendations, please email me at reneshbe@gmail.com


This work is licensed under a Creative Commons Attribution 4.0 International License
