A common way to replace empty cells is to fill them with the mean, median, or mode of the column. Pandas provides the mean(), median(), and mode() methods to compute these values for a given column.
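To make the three statistics concrete, here is a minimal sketch that computes them on a small column built in memory. The numbers are made up for illustration and do not come from data.csv; the variable names are likewise only illustrative.

```python
import pandas as pd

# Illustrative column with one empty (NaN) cell; the values are invented.
calories = pd.Series([300.0, 250.0, None, 300.0, 400.0], name="Calories")

# All three methods skip NaN by default.
mean_value = calories.mean()      # (300 + 250 + 300 + 400) / 4 = 312.5
median_value = calories.median()  # middle of 250, 300, 300, 400 = 300.0
mode_value = calories.mode()[0]   # mode() returns a Series; 300.0 occurs twice

print(mean_value, median_value, mode_value)
```

Note that mode() returns a Series rather than a single number, which is why the first element is selected with [0].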
Calculate the mean and use it to replace any empty values.
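    import pandas as pd

    df = pd.read_csv('data.csv')

    # Mean of the non-empty values in the column
    x = df["Calories"].mean()

    # Assign the result back rather than using inplace=True on a selected column
    df["Calories"] = df["Calories"].fillna(x)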
The mean is the average value, calculated by dividing the sum of all values by the number of values.
Calculate the median and replace any empty values with it.
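    import pandas as pd

    df = pd.read_csv('data.csv')

    # Median of the non-empty values in the column
    x = df["Calories"].median()

    # Assign the result back rather than using inplace=True on a selected column
    df["Calories"] = df["Calories"].fillna(x)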
The median is the middle value once all values are sorted in ascending order. With an even number of values, it is the average of the two middle ones.
Calculate the mode and replace any empty values with it.
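    import pandas as pd

    df = pd.read_csv('data.csv')

    # mode() returns a Series (ties are possible), so take the first entry
    x = df["Calories"].mode()[0]

    # Assign the result back rather than using inplace=True on a selected column
    df["Calories"] = df["Calories"].fillna(x)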
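The mode is the value that occurs most frequently in a dataset. Because several values can tie for most frequent, mode() returns a Series sorted in ascending order, and [0] selects the first (smallest) of them.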
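The three replacements differ only in which statistic is computed, so they can be wrapped in one small helper. The sketch below is one possible way to do that, not a pandas feature: fill_missing and its strategy parameter are hypothetical names, and the sample data is invented so that the three statistics give different results.

```python
import pandas as pd


def fill_missing(df, column, strategy="mean"):
    """Return a copy of df with empty cells in `column` replaced.

    Hypothetical helper: `strategy` is "mean", "median", or "mode".
    """
    if strategy == "mean":
        value = df[column].mean()
    elif strategy == "median":
        value = df[column].median()
    elif strategy == "mode":
        # mode() returns a Series; take the first (smallest) mode
        value = df[column].mode()[0]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    result = df.copy()  # leave the caller's DataFrame unchanged
    result[column] = result[column].fillna(value)
    return result


# Invented data: the cell at row 2 is empty.
df = pd.DataFrame({"Calories": [200.0, 300.0, None, 200.0, 400.0, 900.0]})

for strategy in ("mean", "median", "mode"):
    filled = fill_missing(df, "Calories", strategy)
    print(strategy, filled.loc[2, "Calories"])
# mean 400.0
# median 300.0
# mode 200.0
```

With this data the single large value (900) pulls the mean above the median, which shows that the choice of statistic changes the value written into the empty cell.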