To run a multiple linear regression in Python, we can use the function `OLS.from_formula()` from `statsmodels.api`. For example, if we want to run a regression to predict `score` using `hours_studied` and `breakfast` (contained in a dataset named `survey`), we can fit the model as follows:

```python
import statsmodels.api as sm

model = sm.OLS.from_formula('score ~ hours_studied + breakfast', data=survey).fit()
```
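It may help to see the formula string written as an equation. The symbols $b_0$, $b_1$, and $b_2$ are notation assumed here for the intercept and the two slopes; they are the quantities the fit estimates:

$$
\text{score} = b_0 + b_1 \cdot \text{hours\_studied} + b_2 \cdot \text{breakfast} + \text{error}
$$

The intercept $b_0$ is included automatically by the formula interface, which is why it does not appear in the formula string.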

To actually view the results, we can print a summary of them to the console using the following code.

```python
print(model.summary())
```

Rather than printing the entire summary table, we can call the model coefficients directly using `model.params`. We can even call a specific coefficient by order of appearance in the table. For instance:

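```python
print(model.params)
# Output:
# Intercept        32.665570
# hours_studied     8.540499
# breakfast        22.495615

# .iloc selects by position; plain model.params[0] is deprecated in recent pandas
print(model.params.iloc[0])
# Output:
# 32.66556979549575
```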

From the coefficient table, we can see the intercept is approximately 32.7, the coefficient on `hours_studied` is approximately 8.5, and the coefficient on `breakfast` is approximately 22.5.
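To make these numbers concrete, here is a minimal sketch that plugs the printed coefficients into the regression equation by hand. It assumes `breakfast` is coded as 0 (no) or 1 (yes), which the lesson does not state explicitly, and `predict_score` is a hypothetical helper written for illustration, not part of `statsmodels`.

```python
# Coefficients copied from the model.params output shown in this lesson.
b0 = 32.665570   # Intercept
b1 = 8.540499    # coefficient on hours_studied
b2 = 22.495615   # coefficient on breakfast


def predict_score(hours_studied, breakfast):
    """Return the predicted score from the fitted equation.

    Assumption: breakfast is coded 0 (did not eat) or 1 (ate breakfast).
    This is a hypothetical helper, not a statsmodels function.
    """
    return b0 + b1 * hours_studied + b2 * breakfast


# Two students who each studied 2 hours, differing only in breakfast.
no_breakfast = predict_score(2, 0)
with_breakfast = predict_score(2, 1)

print(round(no_breakfast, 2))
print(round(with_breakfast, 2))
print(round(with_breakfast - no_breakfast, 2))
```

Under that coding assumption, the gap between the two predictions equals the `breakfast` coefficient, and a student with zero hours studied and no breakfast is predicted to score the intercept.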

### Instructions

**1.** Using the `student` dataset, fit a multiple regression model for the response variable `port3` with quantitative predictor `math1` and binary predictor `address`. Save the results as `model1`.

**2.** Print the intercept and coefficients from `model1` using `.params`. Are they listed in the order you thought they’d be?

**3.** Using `model1.params`, save the intercept as `b0`, the coefficient for `math1` as `b1`, and the coefficient for `address` as `b2`. If we added students’ first semester Portuguese score (`port1`) as another predictor to the model, what index would it be in `model1.params`?
