Key Concepts

Review core concepts you need to learn to master this subject


Seaborn is a Python data visualization library that builds on Matplotlib and integrates closely with Pandas DataFrames. It provides a high-level interface for drawing statistical graphics, which makes complex visualizations easier to create.

Seaborn barplot

In Seaborn, you draw a barplot with the function sns.barplot(). This function takes the parameters data, x, and y. data is the dataframe (or dataset) to plot. x is the column of the dataframe that contains the labels for the x axis, and y is the column of the dataframe that contains the values to graph (the values that end up on the y axis).

Using the Seaborn sample data “tips”, we can draw a barplot having the days of the week be the x axis labels, and the total_bill be the y axis values:

sns.barplot(data = tips, x = "day", y = "total_bill")

Seaborn function plots means by default

By default, the seaborn function sns.barplot() plots the means of each category on the x axis.

For example, if the dataframe df holds one satisfaction score per row along with a gender column, a barplot of satisfaction by gender shows the mean satisfaction for every gender in df.
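A minimal sketch of this behavior follows. The dataframe contents and the column names gender and satisfaction are made up for illustration; the point is only that the bar heights match the per-category means computed directly with Pandas.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up survey data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "gender": ["Male", "Male", "Male", "Female", "Female", "Female"],
    "satisfaction": [6, 8, 7, 9, 7, 8],
})

fig, ax = plt.subplots()
sns.barplot(data=df, x="gender", y="satisfaction", ax=ax)

# Heights of the drawn bars, one per gender.
bar_heights = sorted(bar.get_height() for bar in ax.patches)

# The same aggregate computed directly with Pandas.
group_means = sorted(df.groupby("gender")["satisfaction"].mean())

print(bar_heights)
print(group_means)
```

Both printed lists should agree, since the plot aggregates each category with the mean unless told otherwise.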

Barplot error bars

By default, Seaborn’s barplot() function places error bars on the bar plot. Seaborn uses a bootstrapped confidence interval to calculate these error bars.

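The error bars can show the standard deviation instead of a confidence interval by setting the parameter ci = "sd". In Seaborn 0.12 and later, ci is deprecated in favor of the errorbar parameter, so the equivalent setting is errorbar = "sd".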

Estimator argument in barplot

The estimator argument of Seaborn's barplot() function controls how the data is aggregated. By default, each bar of a barplot displays the mean value of a variable. Passing a different estimator changes the statistic that each bar shows.

The estimator argument accepts a function such as np.sum, len, np.median, or another statistical function. Seaborn applies this function to the raw values in each category, such as a list of numbers, and draws the resulting statistic as the height of the bar.
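As an illustrative sketch, the snippet below plots the same made-up data twice, once with np.median and once with len as the estimator. The dataframe and its column names group and value are assumptions for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data with one large outlier in group "B".
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B"],
    "value": [2, 4, 6, 1, 2, 3, 100],
})

fig, (ax_median, ax_count) = plt.subplots(1, 2, figsize=(8, 3))

# Median per group: less sensitive to the outlier than the default mean.
sns.barplot(data=df, x="group", y="value", estimator=np.median, ax=ax_median)
ax_median.set_title("estimator=np.median")

# Number of rows per group.
sns.barplot(data=df, x="group", y="value", estimator=len, ax=ax_count)
ax_count.set_title("estimator=len")

median_heights = sorted(bar.get_height() for bar in ax_median.patches)
count_heights = sorted(bar.get_height() for bar in ax_count.patches)
print(median_heights)
print(count_heights)
```

With the default mean, the outlier would pull group B's bar up to 26.5; with the median it stays at 2.5, which is why the median is often preferred for data with many outliers.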

Seaborn hue

For the Seaborn function sns.barplot(), the hue parameter adds a second categorical dimension to a bar plot: each category on the x axis is split into one bar per value of the hue column.

Using the Seaborn sample data “tips”, we can draw a barplot with the days of the week as the labels on the x axis and the total_bill as the y axis values, with each day split by sex, as follows:

sns.barplot(data = tips, x = "day", y = "total_bill", hue = "sex")

Here, hue divides each day's data into two bars based on the “sex” column: one for male and one for female.
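The grouping behind the plot can be checked with Pandas. The sketch below uses a small made-up dataframe (tips_like, a hypothetical name) that borrows the column names of the “tips” data rather than loading the real dataset, so the numbers are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small made-up stand-in for the "tips" sample data, using the same column names.
tips_like = pd.DataFrame({
    "day": ["Thur", "Thur", "Thur", "Thur", "Fri", "Fri", "Fri", "Fri"],
    "sex": ["Male", "Male", "Female", "Female", "Male", "Male", "Female", "Female"],
    "total_bill": [20.0, 24.0, 15.0, 17.0, 18.0, 22.0, 14.0, 20.0],
})

fig, ax = plt.subplots()
sns.barplot(data=tips_like, x="day", y="total_bill", hue="sex", ax=ax)

# Each bar is the mean of one (day, sex) combination.
nested_means = tips_like.groupby(["day", "sex"])["total_bill"].mean()
print(nested_means)

# The legend lists the hue levels.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

The groupby on both columns mirrors what hue does visually: one aggregate per combination of x category and hue level.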

Seaborn Package

Seaborn is well suited to plotting variables and comparing their distributions. With this package, users can plot univariate and bivariate distributions. These distribution plots convey more than a traditional bar chart: they can show outliers, spread, and the lowest and highest points, none of which a bar chart of aggregates reveals.

Box and Whisker Plots in Seaborn

A box and whisker plot shows a dataset’s median value, quartiles, and outliers. The box’s central line is the dataset’s median, the lower and upper edges of the box mark the 1st and 3rd quartiles, and the “diamonds” show the dataset’s outliers. With Seaborn, multiple datasets can be plotted as adjacent box and whisker plots for easier comparison.
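To make these parts concrete, the sketch below draws two adjacent box plots from made-up data and recomputes the numbers behind one of them with NumPy. The column names dataset and value are assumptions, and the outlier check assumes the default whisker rule of 1.5 times the interquartile range.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: "A" contains one extreme value, "B" does not.
df = pd.DataFrame({
    "dataset": ["A"] * 10 + ["B"] * 10,
    "value": [1, 2, 3, 4, 5, 6, 7, 8, 9, 30] + [4, 5, 5, 6, 6, 7, 7, 8, 8, 9],
})

fig, ax = plt.subplots()
sns.boxplot(data=df, x="dataset", y="value", ax=ax)

# The numbers behind the box for dataset "A".
a = df.loc[df["dataset"] == "A", "value"].to_numpy()
q1, median, q3 = np.percentile(a, [25, 50, 75])
iqr = q3 - q1

# Assumption: the default whisker rule, which flags points more than
# 1.5 * IQR beyond the quartiles as outliers.
outliers = a[(a < q1 - 1.5 * iqr) | (a > q3 + 1.5 * iqr)]
print(q1, median, q3, outliers)
```

For dataset A the quartiles are 3.25 and 7.75 with a median of 5.5, and the value 30 falls outside the whisker range, so it is drawn as a separate outlier marker.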

