Curriculum
Course: Data Science

Stat Correlation Matrix

Correlation Matrix

A matrix is an array of numbers organized in rows and columns.

A correlation matrix is a table of correlation coefficients between variables. The variables are listed along both the first row and the first column, and each cell holds the coefficient for the pair of variables that meet at that row and column.
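To make this layout concrete, here is a minimal sketch using pandas. The DataFrame `toy`, its column names, and its values are made up for illustration and are not part of the health data set. It checks two properties that follow from the definition: each variable correlates perfectly with itself, and the coefficient for (a, b) is the same as for (b, a).

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only; not taken from the health data set.
toy = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [52.0, 55.0, 61.0, 70.0, 74.0],
    "errors": [9.0, 7.0, 8.0, 4.0, 3.0],
})

matrix = toy.corr()  # Pearson correlation is the default method
print(matrix.round(2))

values = matrix.to_numpy()

# Each variable correlates perfectly with itself, so the diagonal is 1.
diagonal_is_one = np.allclose(np.diag(values), 1.0)

# corr(a, b) equals corr(b, a), so the matrix is symmetric.
is_symmetric = np.allclose(values, values.T)

print(diagonal_is_one, is_symmetric)
```

Because of the symmetry, only the cells on one side of the diagonal carry distinct information.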

[Image: correlation matrix table]

The table above uses data from the full health data set.

Observations:

  • Duration and Calorie_Burnage have a strong positive linear relationship, with a correlation coefficient of 0.89. This makes sense: the longer we train, the more calories we burn.
  • There is almost no linear relationship between Average_Pulse and Calorie_Burnage, with a correlation coefficient of 0.02.

Can we conclude that Average_Pulse does not affect Calorie_Burnage? Not yet. We will revisit this question later!
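One general reason for caution is that the correlation coefficient measures linear association only. The sketch below uses made-up arrays `x` and `y` (not the health data, and not necessarily the explanation the course gives later) in which `y` is completely determined by `x`, yet the coefficient is essentially zero.

```python
import numpy as np

# Made-up data: x is symmetric around zero and y depends on x exactly.
x = np.arange(-5, 6, dtype=float)  # -5, -4, ..., 5
y = x ** 2                         # y is fully determined by x, but not linearly

# Pearson correlation coefficient between x and y
r = np.corrcoef(x, y)[0, 1]
print(abs(r) < 1e-9)
```

A coefficient near zero therefore rules out a linear relationship, not every kind of relationship.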

Correlation Matrix in Python

The corr() method of a pandas DataFrame generates a correlation matrix, and Python's built-in round() function rounds the results to two decimal places.

Example

# full_health_data is the pandas DataFrame holding the full health data set
Corr_Matrix = round(full_health_data.corr(), 2)
print(Corr_Matrix)

Output:

[Image: printed correlation matrix for full_health_data, rounded to two decimal places]
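Once the matrix exists, individual coefficients can be read from it by label. The following is a sketch under stated assumptions: `sample` is a hypothetical stand-in for full_health_data that reuses three column names from the lesson but contains made-up values, so the numbers it prints will not match the table above.

```python
import pandas as pd

# Hypothetical stand-in for full_health_data: made-up values, real column names.
sample = pd.DataFrame({
    "Duration": [30, 45, 60, 90, 120, 60],
    "Average_Pulse": [110, 105, 115, 100, 108, 112],
    "Calorie_Burnage": [240, 330, 480, 650, 900, 450],
})

Corr_Matrix = round(sample.corr(), 2)

# Look up a single coefficient by row label and column label.
duration_vs_calories = Corr_Matrix.loc["Duration", "Calorie_Burnage"]

# Rank every variable by its correlation with Calorie_Burnage.
with_calories = Corr_Matrix["Calorie_Burnage"].sort_values(ascending=False)

print(duration_vs_calories)
print(with_calories)
```

Selecting one column this way is a quick way to see which variables are most strongly correlated with a variable of interest.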