
How to create a LinkedIn job scraper in Python with Crawlee

In this article, we will build a web application that scrapes LinkedIn for job postings using Crawlee and Streamlit. The scraper is written with Crawlee for Python and extracts the company name, job title, time of posting, and link to each job posting, based on input the user enters in the web application.

NOTE: One of our community members wrote this blog as a contribution to the Crawlee Blog. If you want to contribute blogs like this to the Crawlee Blog, please reach out to us on our Discord channel.

By the end of this tutorial, you'll have a fully functional web application that you can use to scrape job postings from LinkedIn. Let's begin.

Prerequisites

Let's start by creating a new Crawlee for Python project with this command:

    pipx run crawlee create linkedin-scraper

Select PlaywrightCrawler in the terminal when Crawlee asks for the crawler type. After installation, Crawlee for Python will create boilerplate code for you. Change directory (cd) into the project folder and run this command to install the dependencies:

    poetry install

Next, we will edit the files that Crawlee generated to build our scraper.

NOTE: Before going ahead, if you like reading this blog, we would be really happy if you gave Crawlee for Python a star on GitHub!
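Before editing the generated files, it can help to pin down what the scraper takes in and what it produces. The following is a minimal sketch using only the standard library, not code from the generated project. The `JobPosting` class and `build_search_url` helper are hypothetical names introduced here for illustration, and the search endpoint and its `keywords` and `location` parameters are assumptions about LinkedIn's public job search URL rather than something this article has stated.

```python
from dataclasses import dataclass, asdict
from urllib.parse import urlencode


@dataclass
class JobPosting:
    """One scraped result: the four fields the article sets out to extract."""

    company: str
    title: str
    posted: str  # time of posting as displayed on the page
    link: str


def build_search_url(title: str, location: str) -> str:
    """Turn user input into a search URL.

    The endpoint and parameter names below are assumptions for illustration.
    """
    base_url = "https://www.linkedin.com/jobs/search"  # assumed endpoint
    # urlencode escapes spaces and special characters in the user input.
    query = urlencode({"keywords": title.strip(), "location": location.strip()})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Example of the URL the crawler would start from for one user query.
    print(build_search_url("Python Developer", "Canada"))
    # Example of the record shape each scraped posting would be stored as.
    print(asdict(JobPosting("Example Co", "Python Developer", "2 days ago", "https://example.com/job/1")))
```

Keeping URL construction in a small pure function like this makes it easy to check independently of the browser-based crawler, and the dataclass gives the Streamlit front end a fixed set of columns to display.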

About the company

Crawlee helps you build and maintain your crawlers. It's open source, but built by developers who scrape millions of pages every day for a living.

Skills

python

crawler

web scraping

streamlit