Objective

Understand what data scraping is, why it is useful for SaaS growth, and how to implement a data scraping workflow.

Table of Contents

Resources


<aside> 💡 This guide is meant to provide basic knowledge of a complex topic as concisely as possible. Never take anything as gospel truth; read multiple sources before acting.

</aside>

Introduction

📖 Definition

What is web scraping?


Definition

Web scraping is the process of automatically extracting large volumes of data from websites (usually without the owner's authorisation).

It is a five-step process:

  1. Load the target website
  2. Launch the scraping script
  3. Extract the data into a spreadsheet (Google Sheets or Excel)
  4. Clean the data
  5. [Optional] Enrich the data with additional information
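
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production workflow: the sample HTML, the `lead` class name, the `name | email` layout and every function name are assumptions made up for the example. It also cleans and enriches the rows in memory before writing the sheet, which simplifies the order listed above.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page standing in for step 1 (loading the target website).
SAMPLE_HTML = """
<ul>
  <li class="lead"> Alice Martin | alice@acme.io </li>
  <li class="lead">Bob Chen | bob@globex.com</li>
  <li class="lead"> Alice Martin | alice@acme.io </li>
</ul>
"""


class LeadParser(HTMLParser):
    """Step 2: collect the text of every <li class="lead"> element."""

    def __init__(self):
        super().__init__()
        self.in_lead = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "lead") in attrs:
            self.in_lead = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_lead = False

    def handle_data(self, data):
        if self.in_lead and data.strip():
            self.rows.append(data)


def clean(raw_rows):
    """Step 4: trim whitespace, split the fields, and drop duplicate emails."""
    seen, cleaned = set(), []
    for raw in raw_rows:
        name, email = (part.strip() for part in raw.split("|"))
        if email not in seen:
            seen.add(email)
            cleaned.append({"name": name, "email": email})
    return cleaned


def enrich(rows):
    """Step 5 (optional): add a company domain derived from the email."""
    return [{**row, "domain": row["email"].split("@")[1]} for row in rows]


def to_csv(rows):
    """Step 3: produce CSV text that Google Sheets or Excel can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "email", "domain"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


parser = LeadParser()
parser.feed(SAMPLE_HTML)
leads = enrich(clean(parser.rows))
sheet = to_csv(leads)
print(sheet)
```

In a real workflow the enrichment step would more likely call an external data source. Deriving the domain from the email address is only a stand-in that keeps the example self-contained.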

Technical details

A web scraping program sends an HTTP GET request to the target URL you specify. If the web server considers the request legitimate, it returns the HTML content of the web page. You then store the part of this content you need in your program's environment.
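
As a rough sketch of that request and response cycle, the snippet below uses only Python's standard library. The function names, the `example-scraper/0.1` User-Agent string and the sample HTML are assumptions chosen for illustration. The live network call is left as a comment so the example runs offline.

```python
import urllib.request
from html.parser import HTMLParser


def build_request(url):
    """Prepare an HTTP GET request for the target URL."""
    # The User-Agent value is a made-up placeholder; a server may inspect
    # headers like this one when deciding whether a request looks legitimate.
    return urllib.request.Request(
        url, headers={"User-Agent": "example-scraper/0.1"}, method="GET"
    )


def fetch_html(url, timeout=10):
    """Send the GET request and return the response body as text.

    urlopen raises urllib.error.HTTPError if the server answers with an
    error status such as 403 or 404.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TitleParser(HTMLParser):
    """Keep only one part of the page: the text inside <title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html):
    """Store the part of the HTML we care about in a Python variable."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


# Offline demonstration on a hypothetical page.
sample_html = "<html><head><title> Pricing </title></head><body></body></html>"
page_title = extract_title(sample_html)
print(page_title)

# Against a live site (requires network access):
# page_title = extract_title(fetch_html("https://example.com"))
```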

Examples