Learn

Introduction to Statistics with NumPy

Review

Let’s review! In this lesson, you learned how to use NumPy to analyze single-variable datasets. Here’s what we covered:

- Using the
`np.sort`

method to locate outliers. - Calculating central positions of a dataset using
`np.mean`

and`np.median`

. - Understanding the spread of our data using
**percentiles**and the**interquartile range**. - Finding the standard deviation of a dataset using
`np.std`

.

In our next lesson, we’ll continue our exploration of NumPy and see how we can use it to analyze different statistical distributions. Follow the checkpoints below to practice what you just learned!

A group of citizen scientists has been collecting data on rainfall in Seattle. They’ve presented their data to you in the form of monthly averages, measured in inches.

Month | Avg. Precipitation |
---|---|

January | 5.21 |

February | 3.76 |

March | 3.27 |

April | 2.35 |

May | 1.89 |

June | 1.55 |

July | 0.65 |

August | 1.06 |

September | 1.72 |

October | 3.36 |

November | 4.82 |

December | 5.11 |

We’ve saved this data to the NumPy array `rainfall`

.

Find the mean of the `rainfall`

array and save it to the variable `rain_mean`

.

Find the median of the `rainfall`

array and save it to the variable `rain_median`

.

Find the 25th and the 75th percentiles of the original `rainfall`

array and save them to the arrays `first_quarter`

and `third_quarter`

, respectively.

Calculate the interquartile range and save it to the variable, `interquartile_range`

.

Determine the standard deviation of the set and save it to the variable `rain_std`

.

Print the variables to the terminal to see your results.