The DataFrame object has a method called info(), which provides additional details about the dataset.
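Display information about the data (info() prints the summary itself and returns None, which is why None appears at the end of the output):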
print(df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
dtypes: float64(1), int64(3)
memory usage: 5.4 KB
None
The result shows that there are 169 rows and 4 columns.
RangeIndex: 169 entries, 0 to 168
Data columns (total 4 columns):
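The row and column counts can also be read directly from the DataFrame instead of from the printed summary. The following is a minimal sketch; the small DataFrame is made-up stand-in data, not the tutorial's 169-row dataset.

```python
import pandas as pd

# Hypothetical stand-in data; the tutorial's df has 169 rows and 4 columns.
df = pd.DataFrame({
    "Duration": [60, 60, 45],
    "Pulse": [110, 117, 109],
    "Maxpulse": [130, 145, 175],
    "Calories": [409.1, 479.0, 282.4],
})

# shape is a (rows, columns) tuple
rows, cols = df.shape
print(rows, cols)

# The index describes the row labels, here a RangeIndex starting at 0
print(df.index)
```

On the tutorial's dataset, df.shape would be expected to match the 169 rows and 4 columns reported by info().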
It also includes the name of each column along with its data type.
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   Duration  169 non-null    int64
 1   Pulse     169 non-null    int64
 2   Maxpulse  169 non-null    int64
 3   Calories  164 non-null    float64
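Column names and data types are also available as attributes, which is handy when they are needed in code rather than on screen. This is a small sketch on made-up data, with column names borrowed from the tutorial's dataset.

```python
import pandas as pd

# Hypothetical stand-in data using two of the tutorial's column names.
df = pd.DataFrame({
    "Duration": [60, 60, 45],
    "Calories": [409.1, 479.0, 282.4],
})

# Column names as a plain Python list
column_names = list(df.columns)
print(column_names)

# One dtype per column, returned as a Series indexed by column name
print(df.dtypes)
```

Here the whole-number column is stored as an integer type and the decimal column as float64, mirroring the Dtype column in the info() output.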
The info() method also provides the count of non-null values in each column. In our dataset, the “Calories” column has 164 non-null values out of 169, which means 5 rows have no value in that column. Empty or null values can distort the results of an analysis, so consider removing rows with missing values. This is part of data cleaning, which will be covered in the upcoming chapters.
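The missing values that info() hints at can be counted explicitly, and rows containing them can be dropped. The sketch below uses made-up data with two missing "Calories" values; it is one possible approach, not necessarily the one the later chapters use.

```python
import pandas as pd

# Hypothetical stand-in data; None becomes NaN (a missing value) in pandas.
df = pd.DataFrame({
    "Duration": [60, 60, 45, 45],
    "Calories": [409.1, None, 282.4, None],
})

# Count missing values per column
missing_per_column = df.isna().sum()
print(missing_per_column)

# dropna() returns a new DataFrame without the rows that contain missing values
cleaned = df.dropna()
print(len(df), len(cleaned))
```

Because dropna() returns a new DataFrame by default, the original df is left unchanged.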