We notice that the non-numeric values (9,000 and “AF”) are in the same rows as the missing values.
Solution: We can remove the rows with missing observations to resolve this issue.
When loading a dataset with Pandas, all blank cells are automatically converted into “NaN” values.
By removing the NaN values, we obtain a clean dataset suitable for analysis.
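To see the blank-cell conversion in isolation, here is a minimal sketch. The CSV text and the column names are made up for illustration and are not the tutorial's actual dataset; the only claim being shown is that an empty cell is read in as NaN.

```python
import io

import pandas as pd

# Hypothetical CSV text (not the tutorial's data): the second data row
# has a blank Max_Pulse cell.
csv_text = "Duration,Average_Pulse,Max_Pulse\n30,80,120\n45,85,\n60,90,130\n"

# read_csv accepts any file-like object, so StringIO stands in for a file.
sample = pd.read_csv(io.StringIO(csv_text))

# isna() marks every missing cell with True.
print(sample.isna())

# Count the missing values in the column that had the blank cell.
missing_count = int(sample["Max_Pulse"].isna().sum())
print(missing_count)
```

Checking isna() before dropping anything is a cheap way to confirm how many rows would be affected.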
We can use the dropna() function to remove the NaN values. axis=0 specifies that we want to remove all rows containing a NaN value.
health_data.dropna(axis=0, inplace=True)
print(health_data)
The result is a dataset with all NaN rows removed.
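The same step can be tried on a small stand-in table. The DataFrame below, named demo, is hypothetical and only mimics the situation described above: a non-numeric value ("AF") sits in the same row as a missing value, so dropping that row removes both. Note that dropna() only looks for missing values; a non-numeric entry in a row without any NaN would not be removed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for health_data (values invented for illustration).
demo = pd.DataFrame({
    "Duration": [30, 45, 60, 45],
    "Average_Pulse": [80, np.nan, 90, 85],
    "Max_Pulse": ["120", "AF", "130", "125"],
})

rows_before = len(demo)

# axis=0 drops rows; inplace=True modifies demo instead of returning a copy.
demo.dropna(axis=0, inplace=True)

rows_after = len(demo)

print(demo)
print(f"Removed {rows_before - rows_after} row(s)")
```

Comparing the row count before and after the call makes it easy to see how much data was discarded.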