Learn

In a multiple regression model, the coefficient on a quantitative predictor is the expected difference in the outcome variable for a one-unit increase in that predictor, holding all other predictors constant.
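One way to see this is to write a generic two-predictor model and subtract two predictions that differ by one unit in a single predictor. The symbols b_0, b_1, b_2, x_1, x_2 and \hat{y} are notation assumed here for illustration, not taken from the lesson:

```latex
\begin{aligned}
\hat{y}(x_1, x_2) &= b_0 + b_1 x_1 + b_2 x_2 \\
\hat{y}(x_1 + 1, x_2) - \hat{y}(x_1, x_2) &= b_1 (x_1 + 1) - b_1 x_1 = b_1
\end{aligned}
```

Because x_2 is the same in both predictions, the b_0 and b_2 x_2 terms cancel and only b_1 remains.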

For the survey dataset, the multiple regression equation is:

$$\text{score} = 16.7 + 6.3*\text{hours\_studied} + 4.7*\text{assignments}$$

The predictor assignments is a quantitative variable. Let’s substitute a few different values for assignments into the regression equation to see how it changes:

  • For students who completed 0 assignments:
$$\begin{aligned} \text{score} &= 16.7 + 6.3*\text{hours\_studied} + 4.7*\mathbf{0}\\ \text{score} &= 16.7 + 6.3*\text{hours\_studied} \end{aligned}$$
  • For students who completed 1 assignment:
$$\begin{aligned} \text{score} &= 16.7 + 6.3*\text{hours\_studied} + 4.7*\mathbf{1}\\ \text{score} &= 21.4 + 6.3*\text{hours\_studied} \end{aligned}$$
  • For students who completed 2 assignments:
$$\begin{aligned} \text{score} &= 16.7 + 6.3*\text{hours\_studied} + 4.7*\mathbf{2}\\ \text{score} &= 26.1 + 6.3*\text{hours\_studied} \end{aligned}$$

The only difference between the equations is that we add 4.7 points to the intercept for each additional completed assignment. Thus, among students who studied the same number of hours (i.e., holding all other variables constant), students who completed one more assignment scored 4.7 points higher on the test, on average.
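The substitution above can also be checked numerically. The following is a minimal sketch that hard-codes the coefficients from the regression equation; the helper name predict_score and the chosen values of hours_studied are assumptions made for illustration.

```python
# Coefficients copied from the regression equation in this lesson.
INTERCEPT = 16.7
B_HOURS = 6.3
B_ASSIGNMENTS = 4.7


def predict_score(hours_studied, assignments):
    """Predicted score from the fitted equation (illustrative helper)."""
    return INTERCEPT + B_HOURS * hours_studied + B_ASSIGNMENTS * assignments


# Hold hours_studied fixed at an arbitrary value and vary assignments.
fixed_hours = 3
for n_assignments in range(3):
    print(n_assignments, round(predict_score(fixed_hours, n_assignments), 1))

# The gap between consecutive predictions equals the assignments
# coefficient, whichever value of hours_studied is held fixed.
for hours in (0, 3, 10):
    gap = predict_score(hours, 2) - predict_score(hours, 1)
    print(hours, round(gap, 1))
```

Each printed gap matches the coefficient on assignments because the hours_studied term is identical in both predictions and cancels in the subtraction.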

Instructions

1.

Suppose that we ran a model predicting Portuguese score (port3) based on first math score (math1) and first Portuguese score (port1). The code and coefficients are shown below:

    import statsmodels.api as sm

    model2 = sm.OLS.from_formula('port3 ~ math1 + port1', data=student).fit()
    print(model2.params)

    # Output:
    # Intercept    0.440159
    # math1        0.111161
    # port1        0.860927

In the file interpretations.txt write a one-sentence interpretation for the intercept. What does this value represent in the context of the dataset?

2.

In interpretations.txt, add a one-sentence interpretation for the coefficient on math1 and another for the coefficient on port1. Check the sample solutions in the file solutions.txt and compare them to your own.

3.

Compare your interpretations of the coefficients on math1 and port1 with the sample solutions in the file solutions.txt. Why is it important to hold the other predictors constant in our interpretation of a coefficient?
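As a rough illustration of what "holding the other predictors constant" means for this model, the sketch below plugs the coefficients printed in step 1 into the fitted equation. It does not refit the model; the helper predict_port3 and the example input values are made up for illustration and are not rows from the dataset.

```python
# Coefficients as printed by model2.params in step 1.
params = {"Intercept": 0.440159, "math1": 0.111161, "port1": 0.860927}


def predict_port3(math1, port1):
    """Predicted port3 from the reported coefficients (illustrative helper)."""
    return params["Intercept"] + params["math1"] * math1 + params["port1"] * port1


# Hypothetical students chosen only to show the arithmetic.
base = predict_port3(math1=10, port1=12)
math_up = predict_port3(math1=11, port1=12)  # only math1 changes
both_up = predict_port3(math1=11, port1=13)  # math1 and port1 both change

print(round(math_up - base, 6))  # difference with port1 held constant
print(round(both_up - base, 6))  # difference when port1 also moves
```

The first difference recovers the math1 coefficient alone. The second mixes in the port1 coefficient, so it no longer isolates the association between math1 and port3.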
