https://www.youtube.com/watch?v=aQU9aZ6h_DM&feature=youtu.be
Problem breakdown
Task 1: Log in to LinkedIn (using the Selenium library)
- Task 1.1: Open the Chrome browser and access the LinkedIn site
- Task 1.2: Key in the login credentials and click the Login button
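Task 1 could be sketched roughly as below. This is a minimal sketch, not the method from the video: the login URL, the element locators ("username", "password", the submit button) and the environment variable names LINKEDIN_USER / LINKEDIN_PASS are all assumptions and may not match the live LinkedIn page.

```python
import os

# Assumption: LinkedIn serves its login form at this address.
LOGIN_URL = "https://www.linkedin.com/login"


def read_credentials(env=None):
    """Return (username, password) read from environment variables.

    LINKEDIN_USER and LINKEDIN_PASS are hypothetical names chosen for this
    sketch, so that the credentials are not hard-coded in the script.
    """
    env = os.environ if env is None else env
    try:
        return env["LINKEDIN_USER"], env["LINKEDIN_PASS"]
    except KeyError as missing:
        raise RuntimeError(
            f"Set the {missing.args[0]} environment variable"
        ) from None


def login_to_linkedin(username, password, timeout=10):
    """Open Chrome, submit the login form, and return the driver."""
    # Selenium is imported here so read_credentials() stays usable
    # on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Task 1.1: open Chrome and access the site.
    driver = webdriver.Chrome()
    driver.get(LOGIN_URL)

    # Task 1.2: key in the credentials and click the Login button.
    # The locators below are assumptions about the page markup.
    wait = WebDriverWait(driver, timeout)
    wait.until(
        EC.presence_of_element_located((By.ID, "username"))
    ).send_keys(username)
    driver.find_element(By.ID, "password").send_keys(password)
    driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()
    return driver
```

With both variables set, driver = login_to_linkedin(*read_credentials()) would open the browser and attempt the login. If a locator no longer matches, the explicit wait raises a timeout instead of failing silently.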
Task 2: Search for the profiles we want to crawl (using the Selenium library)
- Locate the search bar, input the search query, and submit the search
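One possible shape for Task 2 is shown below. It assumes a driver that is already logged in, and the CSS selector for the search bar is a guess at the page markup, not something confirmed by the source. normalize_query is a hypothetical helper added for this sketch.

```python
def normalize_query(query):
    """Collapse repeated whitespace and reject an empty search query."""
    cleaned = " ".join(query.split())
    if not cleaned:
        raise ValueError("search query must not be empty")
    return cleaned


def search_profiles(driver, query, timeout=10):
    """Type the query into the search bar and press Enter."""
    # Imported here so normalize_query() works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: the search bar is an <input> with placeholder "Search".
    search_box = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(
            (By.CSS_SELECTOR, "input[placeholder='Search']")
        )
    )
    search_box.clear()
    search_box.send_keys(normalize_query(query))
    search_box.send_keys(Keys.RETURN)
```

For example, search_profiles(driver, "data scientist singapore") would submit that query. Waiting for the element to become clickable avoids typing into the page before the search bar has loaded.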
Task 3: Extract the URLs of the profiles (using the BeautifulSoup4 library)
- Task 3.1: Write a function to extract the profile URLs from one results page
- Task 3.2: Navigate through the next pages, and call that function to extract the profile URLs on each page
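Task 3 might look something like the sketch below. It assumes that profile links have a path starting with /in/ and that the results page has a Next button matching the selector shown. Both are assumptions about LinkedIn's markup, and the function names are hypothetical.

```python
import time
from urllib.parse import urljoin, urlsplit

BASE_URL = "https://www.linkedin.com"


def clean_profile_url(href):
    """Return an absolute profile URL without its query string.

    Returns None when the link does not look like a profile link
    (assumption: profile paths start with /in/).
    """
    parts = urlsplit(urljoin(BASE_URL, href))
    if not parts.path.startswith("/in/"):
        return None
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


def extract_profile_urls(html):
    """Task 3.1: collect the unique profile URLs found in one page."""
    # Imported here so clean_profile_url() works without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for anchor in soup.find_all("a", href=True):
        url = clean_profile_url(anchor["href"])
        if url and url not in urls:
            urls.append(url)
    return urls


def collect_profile_urls(driver, max_pages=3, pause=2.0):
    """Task 3.2: walk through result pages and gather profile URLs."""
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    all_urls = []
    for _ in range(max_pages):
        for url in extract_profile_urls(driver.page_source):
            if url not in all_urls:
                all_urls.append(url)
        try:
            # Assumption: the pagination button carries aria-label "Next".
            driver.find_element(
                By.CSS_SELECTOR, "button[aria-label='Next']"
            ).click()
        except NoSuchElementException:
            break  # no further pages
        time.sleep(pause)  # give the next page time to load
    return all_urls
```

Stripping the query string means the same profile reached through two different links is stored only once. The max_pages cap keeps the crawl bounded while experimenting.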
Task 4: Scrape the data of one LinkedIn profile (using the BeautifulSoup4 library), and write the data to a .csv file (using the csv module)
- Task 4.1: Access all the URLs we extracted in Task 3
- Task 4.2: Write a function to access and scrape the data of one LinkedIn profile
- Task 4.3: Write the output to a .csv file
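A rough sketch of Task 4 follows. The three fields (url, name, headline) and the tags used to find them are illustrative assumptions, since the outline does not say which data to scrape. The real profile markup will likely need different selectors.

```python
import csv
import time

# Hypothetical set of columns for this sketch.
FIELDNAMES = ["url", "name", "headline"]


def parse_profile(html, url):
    """Task 4.2: pull a few fields out of one profile page."""
    # Imported here so write_profiles_csv() works without bs4 installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    # Assumption: the name is in the first <h1>, the headline in a
    # <div> with class "text-body-medium".
    name_tag = soup.find("h1")
    headline_tag = soup.find("div", class_="text-body-medium")
    return {
        "url": url,
        "name": name_tag.get_text(strip=True) if name_tag else "",
        "headline": headline_tag.get_text(strip=True) if headline_tag else "",
    }


def write_profiles_csv(rows, path):
    """Task 4.3: write one row per profile, with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


def scrape_profiles(driver, urls, path, pause=2.0):
    """Task 4.1: visit every URL from Task 3, scrape it, save the result."""
    rows = []
    for url in urls:
        driver.get(url)
        time.sleep(pause)  # let the profile page finish loading
        rows.append(parse_profile(driver.page_source, url))
    write_profiles_csv(rows, path)
    return rows
```

Opening the file with newline="" follows the csv module's recommendation and avoids blank lines between rows on Windows. Missing fields are written as empty strings so that one incomplete profile does not stop the run.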
Extension use cases (capstone project)
Sample 1: Write a script to automatically connect with all the profiles you crawled
Sample 2: Conduct an analysis to cluster the profiles into personas and analyze the patterns of certain personas
Knowledge checkpoints