Missing cells can lead to inaccurate results when analyzing data.
One approach to handling empty cells is to remove rows that contain them. This is often acceptable, as datasets are typically large, and removing a few rows usually has minimal impact on the results.
Generate a new DataFrame excluding rows with empty cells:
import pandas as pd

df = pd.read_csv('data.csv')
new_df = df.dropna()
print(new_df.to_string())
Note: By default, the dropna() method creates a new DataFrame without modifying the original.
To modify the original DataFrame, use the inplace=True argument.
Delete all rows containing NULL values:
import pandas as pd

df = pd.read_csv('data.csv')
df.dropna(inplace=True)
print(df.to_string())
Note: With dropna(inplace=True), no new DataFrame is returned; instead, rows with NULL values are removed from the original DataFrame.
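The difference between the two ways of calling dropna() can be checked without a CSV file. The sketch below uses a small in-memory DataFrame as a stand-in for data.csv; the "Duration" column and all values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv: two of the four rows have an empty cell.
df = pd.DataFrame({
    "Duration": [60, 45, 60, 30],
    "Calories": [409.1, np.nan, 340.0, np.nan],
})

# Default call: returns a new DataFrame and leaves df untouched.
new_df = df.dropna()
print(len(df), len(new_df))  # 4 2

# inplace=True: removes the rows from df itself and returns None.
result = df.dropna(inplace=True)
print(result, len(df))  # None 2
```

Because the in-place call returns None, writing df = df.dropna(inplace=True) would replace the DataFrame with None.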
Another way to handle empty cells is to replace them with a new value, so you do not have to delete entire rows just because a few cells are empty. The fillna() method lets you replace empty cells with a specified value.
Replace NULL values with the value 130:
import pandas as pd

df = pd.read_csv('data.csv')
df.fillna(130, inplace=True)
The example above replaces empty cells throughout the entire DataFrame. To replace empty values in just one column, apply fillna() to that column only.
Replace NULL values in the “Calories” column with the value 130:
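import pandas as pd

df = pd.read_csv('data.csv')
# Assign the result back: calling fillna(inplace=True) on a selected column
# may not update df in recent pandas versions.
df["Calories"] = df["Calories"].fillna(130)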
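As an alternative sketch, fillna() also accepts a dictionary that maps column names to replacement values, which limits the replacement to the listed columns. The DataFrame below is a made-up stand-in for data.csv, and the "Duration" column is an assumption added only to show that other columns are left alone.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv: row 1 is empty in both columns.
df = pd.DataFrame({
    "Duration": [60.0, np.nan, 45.0],
    "Calories": [409.1, np.nan, 282.4],
})

# Fill empty cells in "Calories" only; "Duration" keeps its empty cell.
filled = df.fillna({"Calories": 130})
print(filled.to_string())

# Count the empty cells that remain in each column.
print(filled["Calories"].isna().sum(), filled["Duration"].isna().sum())  # 0 1
```

This form returns a new DataFrame, so the original df still contains its empty cells unless the result is assigned back.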