Web Scraping with Beautiful Soup
Lesson 1 of 1
  1. 1
    Before we get started, a quick note on prerequisites: This course requires knowledge of Python. Some understanding of the Python library pandas will also be helpful later on in the lesson, …
  2. 2
    When we scrape websites, we have to make sure we are following some guidelines so that we are treating the websites and their owners with respect. Always check a website’s Terms and Conditions bef…
  3. 3
    To get the HTML of a website, we need to make a request for the content of the webpage. To learn more about requests in a general sense, you can check out this article. Python ha…
  4. 4
    When we printed out all of that HTML from our request, it seemed pretty long and messy. How could we pull out the relevant information from that long string? BeautifulSoup is a Python library that…
  5. 5
    BeautifulSoup breaks the HTML page into several types of objects. Tags: A Tag corresponds to an HTML tag in the original document. These lines of code: soup = BeautifulSoup(‘ An example di…
  6. 6
    To navigate through a tree, we can call the tag names themselves. Imagine we have an HTML page that looks like this: World’s Best Chocolate Chip Cookies Ingredients 1 cup flour …
  7. 7
    When we’re telling our Python script what HTML tags to grab, we need to know the structure of the website and what we’re looking for. Many browsers, including Chrome, Firefox, and Safari, have Dev…
  8. 8
    If we want to find all of the occurrences of a tag, instead of just the first one, we can use .find_all(). This function can take in just the name of a tag and returns a list of all occurrences o…
  9. 9
    Another way to capture your desired elements with the soup object is to use CSS selectors. The .select() method will take in all of the CSS selectors you normally use in a .css file! Search Resul…
  10. 10
    When we use BeautifulSoup to select HTML elements, we often want to grab the text inside of the element, so that we can analyze it. We can use .get_text() to retrieve the text inside of whatever ta…
  11. 11
    Amazing! Now you know the basics of how to use BeautifulSoup to turn websites into data. If you take our Data Visualization or Data Manipulation courses, you can see how you might analyze this …
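
The exercises above introduce parsing, tag navigation, .find_all(), .select(), and .get_text(). As a rough sketch of how those pieces might fit together, the snippet below parses a small inline HTML string instead of requesting a live page. The markup is hypothetical: only the cookie-recipe heading and the "1 cup flour" ingredient echo the lesson preview, while the second ingredient, the `ingredients` class name, and every variable name are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, loosely modeled on the recipe page mentioned in the outline.
# The class name "ingredients" and the second list item are invented for this sketch.
html = """
<html>
  <body>
    <h1>World's Best Chocolate Chip Cookies</h1>
    <ul class="ingredients">
      <li>1 cup flour</li>
      <li>1/2 cup sugar</li>
    </ul>
  </body>
</html>
"""

# Parse the string with Python's built-in HTML parser (no extra parser install needed).
soup = BeautifulSoup(html, "html.parser")

# Navigating by tag name returns the first matching tag in the tree.
first_heading = soup.h1

# .find_all() returns every occurrence of a tag, not just the first.
items = soup.find_all("li")

# .select() accepts a CSS selector and also returns a list of matches.
selected = soup.select("ul.ingredients li")

# .get_text() pulls out the text inside a tag.
heading_text = first_heading.get_text()
ingredient_names = [li.get_text() for li in items]

print(heading_text)
print(ingredient_names)
```

On a real site, the HTML string would come from an HTTP response rather than a literal, and the tag names and selectors would depend on that site's structure as seen in the browser's developer tools.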

What you'll create

Portfolio projects that showcase your new skills

How you'll master it

Stress-test your knowledge with quizzes that help commit syntax to memory
