DS Linear Regression Case

Case: Use Duration + Average_Pulse to Predict Calorie_Burnage

Create a Linear Regression Table using Average Pulse and Duration as explanatory variables.

Example

import pandas as pd
import statsmodels.formula.api as smf

# Load the health data set from a CSV file
full_health_data = pd.read_csv("data.csv", header=0, sep=",")

# Response variable on the left of ~, explanatory variables on the right
model = smf.ols('Calorie_Burnage ~ Average_Pulse + Duration', data=full_health_data)
results = model.fit()
print(results.summary())

Example Explained:

  • Import the library statsmodels.formula.api as smf. Statsmodels is a statistical library in Python.
  • Use the full_health_data dataset.
  • Create a model based on Ordinary Least Squares using smf.ols(). In the formula string, the response variable (Calorie_Burnage) is listed first, followed by ~ and then the explanatory variables (Average_Pulse and Duration) joined with +.
  • By calling .fit() on the model, you obtain the results variable, which contains detailed information about the fitted regression model.
  • Finally, call .summary() to display the linear regression results table.
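
Besides printing the full summary, the fitted coefficients can be read directly from the results object. The sketch below is only an illustration: it uses a small made-up data set (toy_data, not the lesson's data.csv) whose Calorie_Burnage values are generated from a known formula, so the fitted coefficients can be checked by eye. The names toy_data and new_session are assumptions introduced for this example.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data (NOT the lesson's data.csv). Calorie_Burnage is
# generated exactly as 3 * Average_Pulse + 5 * Duration - 300.
toy_data = pd.DataFrame({
    "Average_Pulse": [90, 100, 110, 120, 130, 105, 95, 125],
    "Duration": [30, 45, 60, 30, 45, 60, 75, 90],
})
toy_data["Calorie_Burnage"] = (
    3 * toy_data["Average_Pulse"] + 5 * toy_data["Duration"] - 300
)

# Same formula as in the lesson: response first, then the explanatory variables
results = smf.ols("Calorie_Burnage ~ Average_Pulse + Duration", data=toy_data).fit()

# results.params is a pandas Series indexed by term name
intercept = results.params["Intercept"]
pulse_coef = results.params["Average_Pulse"]
duration_coef = results.params["Duration"]
print(results.params)

# Predict Calorie_Burnage for a new, made-up training session
new_session = pd.DataFrame({"Average_Pulse": [110], "Duration": [60]})
predicted = results.predict(new_session)
print(predicted.iloc[0])
```

Because the toy values follow the formula exactly, the fitted coefficients should come out very close to 3, 5 and -300. With real data such as full_health_data, the same attributes hold the values shown in the summary table.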

[Image: linear regression results table for this case]

The linear regression function can be expressed mathematically as:

Calorie Burnage = Average Pulse * 3.1695 + Duration * 5.8424 - 334.5194

Rounded to two decimal places:

Calorie Burnage = Average Pulse * 3.17 + Duration * 5.84 - 334.52
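
The regression function can also be written as a small Python function to estimate Calorie Burnage for a given session. This is a minimal sketch: the coefficients are copied from the equation in this lesson, while the function name predict_calorie_burnage and the example input values (average pulse 110, duration 60) are assumptions chosen for illustration.

```python
# Coefficients copied from the regression equation in this lesson
INTERCEPT = -334.5194
PULSE_COEF = 3.1695
DURATION_COEF = 5.8424


def predict_calorie_burnage(average_pulse, duration):
    """Return the Calorie Burnage predicted by the fitted linear function."""
    return PULSE_COEF * average_pulse + DURATION_COEF * duration + INTERCEPT


# Hypothetical session: average pulse of 110 and a duration of 60
estimate = predict_calorie_burnage(110, 60)
print(round(estimate, 2))
```

Using the rounded coefficients (3.17, 5.84 and -334.52) instead gives a slightly different estimate, which is the price of the shorter formula.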