Automatically Get Top 10 Jobs from LinkedIn Using Python

In this article we use Clicknium to scrape the top 10 jobs from a LinkedIn job search. First, we log in to LinkedIn and search for jobs by keyword (a title, a skill, or a company) and location, then take the top 10 jobs from the search results. For each job we collect the title, the company name, the company size, the post date, the job type, and the link URL. Finally, we save the results to a CSV file.

An overview of the steps:

  • Login to LinkedIn
  • Search jobs with the keyword and location
  • Scrape the information of the top 10 jobs
  • Save search results into csv file


1.1 Python modules

The Clicknium Python module provides methods to automate various types of applications on Windows, such as web browsers, Windows desktop applications, Java applications, and SAP Windows GUI applications. In this sample we also use the pywin32 Python module, which provides access to many of the Windows APIs from Python, to read the clipboard data.

Install the Python libraries with the following commands:

pip install clicknium
pip install pywin32
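Before running the sample, it can help to confirm that both packages are importable. The following is a minimal sketch using only the standard library; `missing_modules` is a hypothetical helper name, and the sketch assumes that pywin32 is imported through the `win32clipboard` module used later in this article.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for this article: clicknium and win32clipboard
    missing = missing_modules(["clicknium", "win32clipboard"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If a module is reported as missing, rerun the matching `pip install` command in the same Python environment that runs the script.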

1.2 Clicknium Visual Studio Code Extension

The Clicknium VS Code extension can install the Clicknium browser extension into the browser of your choice; Clicknium uses that browser extension to interact with the browser. It also makes it easier to capture, edit, and validate elements.

Login to LinkedIn

2.1 Capturing Steps using clicknium VS Code extension

Besides writing the Python code that automates the login, the job search, and the saving of the data, we also need to capture the web elements in the Chrome browser using the Clicknium VS Code extension. To launch it, press Ctrl+Shift+P to open the command palette, then type and select “clicknium capture”. This opens a capture dialog in which you record web elements with Ctrl+Click. After capturing the elements described in the steps below, click Complete and run the Python code.

Launch Clicknium Capture Dialog

2.2 In this section, we will capture the related elements of the login page

login page

2.3 Open the browser with LinkedIn website, input the account username and password and then click the Sign in button


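from clicknium import clicknium as cc, locator
# Settings loaded from setting.json (see the Setting class)
from setting import Setting

# Note: the locator.linkedin.* names are placeholders; use the
# names you gave the elements when capturing them.
# Create a browser instance with "cc.chrome"
# (for the Edge browser, use "cc.edge").
# Open the browser at the given URL and get the browser tab.
# By default it waits for the page to load completely,
# so no extra time.sleep() is needed.
_tab = cc.chrome.open("https://www.linkedin.com/", is_wait_complete=True)
# Find the username input box and fill it with the
# 'linkedin_login_name' value from setting.json
_tab.find_element(locator.linkedin.login_email).set_text(Setting.login_name)
# Find the password input box and fill it with the
# 'linkedin_login_password' value from setting.json
_tab.find_element(locator.linkedin.login_password).set_text(Setting.login_password)
# Find the submit button and click it to log in
_tab.find_element(locator.linkedin.login_submit).click()
# Wait up to 5 seconds for the 'skip add phone' button;
# if it appears, click it
skip_btn = _tab.wait_appear(locator.linkedin.login_skip, wait_timeout=5)
if skip_btn:
    skip_btn.click()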
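The "skip add phone" prompt is optional, so clicking it should tolerate the case where it never appears. The sketch below wraps that pattern in a hypothetical helper, `click_if_appears`. It assumes only what the listings in this article already rely on: that the tab object has a `wait_appear(locator, wait_timeout=...)` method returning a falsy value when the element does not show up, and that elements have a `click()` method.

```python
def click_if_appears(tab, target, timeout=5):
    """Click `target` if it shows up within `timeout` seconds.

    Returns True when the element appeared and was clicked, otherwise False.
    """
    element = tab.wait_appear(target, wait_timeout=timeout)
    if not element:
        # The optional element never appeared; nothing to click
        return False
    element.click()
    return True
```

With such a helper, the optional step becomes a single call that cannot fail with an attribute error on a missing element, and the return value tells the caller whether the prompt was shown.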

Search jobs with the keyword and location

3.1 In this section, we will capture the related elements of the job search page

job search page

3.2 Switch to the Jobs tab, fill out keyword and location of the job, and then click the Search button


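# Note: the locator.linkedin.* names are placeholders; use the
# names you gave the elements when capturing them.
# After submitting the login information, wait for the page to
# load: wait up to 10 seconds for the Jobs tab, then click it
# to switch to the Jobs page
jobs_tab = _tab.wait_appear(locator.linkedin.jobs_tab, wait_timeout=10)
if jobs_tab:
    jobs_tab.click()
# Wait up to 10 seconds for the job search keyword input box;
# if it exists, fill it with the 'linkedin_search_job_key'
# value from setting.json
keyword_box = _tab.wait_appear(locator.linkedin.job_search_keyword, wait_timeout=10)
if keyword_box:
    keyword_box.set_text(Setting.search_job_key)
# Find the job search location input box and fill it with the
# 'linkedin_search_job_location' value from setting.json
_tab.find_element(locator.linkedin.job_search_location).set_text(
    Setting.search_job_location)
# Find the search button and click it to search
_tab.find_element(locator.linkedin.job_search_button).click()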

Scrape the information of the top 10 jobs

4.1 In this section, we will capture the elements below:

job detail information

4.2 Get each job item from the search result list using the index parameter


# range(1, 11) gets the top 10 jobs;
# change the upper bound to get more or fewer
for i in range(1, 11):
    # Wait up to 5 seconds for the job item at this index.
    # 'locator.linkedin.job_item' is a placeholder for the name
    # you gave the job item element when capturing it.
    ele = _tab.wait_appear(locator.linkedin.job_item,
                           {"index": i}, wait_timeout=5)

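A search may return fewer than 10 results, in which case waiting 5 seconds for each missing index adds up. The sketch below shows one possible variation: a hypothetical generator, `iter_job_items`, that stops at the first index for which no item appears. It assumes the same call shape used in the listing above, `wait_appear(locator, {"index": i}, wait_timeout=...)`, and a falsy return value when nothing appears.

```python
def iter_job_items(tab, item_locator, limit=10, timeout=5):
    """Yield (index, element) for job items 1..limit, stopping at the first gap."""
    for index in range(1, limit + 1):
        element = tab.wait_appear(item_locator, {"index": index},
                                  wait_timeout=timeout)
        if not element:
            # No item at this index: assume the result list has ended
            break
        yield index, element
```

Stopping at the first gap is a design choice; the loop in this article instead keeps trying every index up to 10, which is more tolerant of a single item that is slow to render.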
4.3 Get the title, the company name, the size of the company, the post date, the job type for each job item


# Note: the locator.linkedin.* names are placeholders; use the
# names you gave the elements when capturing them.
# Initialize the dict that holds this job item's details
details = {}
# Click the job item to show its details
ele.click()
# Wait up to 5 seconds for the job title to appear
job_title_ele = _tab.wait_appear(locator.linkedin.job_title, wait_timeout=5)
# If the job title exists, get its text
# and save it into the result object 'details'
if job_title_ele:
    details["Job Title"] = job_title_ele.get_text().strip()
# Wait up to 2 seconds for the company name to appear
job_company_ele = _tab.wait_appear(locator.linkedin.job_company, wait_timeout=2)
# If the company name exists, get its text
# and save it into the result object 'details'
if job_company_ele:
    details["Company Name"] = job_company_ele.get_text().strip()
# Wait up to 2 seconds for the company size to appear
company_size_ele = _tab.wait_appear(locator.linkedin.job_company_size, wait_timeout=2)
# If the company size exists, keep its text only when it
# mentions "employees", and save it into 'details'
if company_size_ele:
    size_text = company_size_ele.get_text()
    scale = size_text.strip() if "employees" in size_text else ""
    details["Company Size"] = scale
# Wait up to 2 seconds for the post date to appear
job_post_date_ele = _tab.wait_appear(locator.linkedin.job_post_date,
                                     wait_timeout=2)
# If the post date exists, keep its text only when it
# mentions "ago", and save it into 'details'
if job_post_date_ele:
    date_text = job_post_date_ele.get_text()
    post_date = date_text.strip() if "ago" in date_text else ""
    details["Post Date"] = post_date
# Wait up to 2 seconds for the job type to appear
job_type_ele = _tab.wait_appear(locator.linkedin.job_type,
                                wait_timeout=2)
# If the job type exists, get its text
# and save it into the result object 'details'
if job_type_ele:
    details["Job Type"] = job_type_ele.get_text().strip()
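The company size and post date fields use the same rule: keep the text only if it contains an expected marker word ("employees" or "ago"), otherwise store an empty string. A minimal sketch of that rule as a pure function follows; `text_if_contains` is a hypothetical helper name, not part of Clicknium.

```python
def text_if_contains(text, marker):
    """Return `text` stripped if it contains `marker`, otherwise an empty string."""
    cleaned = (text or "").strip()
    return cleaned if marker in cleaned else ""


# Example values (made up for illustration)
print(text_if_contains("  51-200 employees \n", "employees"))  # 51-200 employees
print(repr(text_if_contains("Full-time", "employees")))        # ''
```

Passing the result of a single `get_text()` call to such a function also avoids querying the element twice for the same text.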

4.4 Get job link 

4.4.1 Getting clipboard data with pywin32


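# Library for the win32 clipboard API
import win32clipboard

# Get the clipboard data
def get_clipboard_data():
    try:
        # Open the clipboard
        win32clipboard.OpenClipboard()
    except Exception:
        # The clipboard could not be opened: return an empty string
        return ""
    try:
        # Read the clipboard data and return it
        data = win32clipboard.GetClipboardData()
        return data
    except Exception:
        # If reading fails, return an empty string
        return ""
    finally:
        # Always close the clipboard
        win32clipboard.CloseClipboard()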

4.4.2 Click the Share button and Copy link button, then get data from clipboard 


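# Library for the delay function
from time import sleep

# Note: the locator.linkedin.* names are placeholders; use the
# names you gave the elements when capturing them.
# Wait up to 2 seconds for the job item's share button
job_share_btn_ele = _tab.wait_appear(locator.linkedin.job_share_button, wait_timeout=2)
# If the share button exists, click it
if job_share_btn_ele:
    job_share_btn_ele.click()
    # Wait up to 2 seconds for the copy link button
    copy_link = _tab.wait_appear(locator.linkedin.job_copy_link, wait_timeout=2)
    # If the copy link button exists, click it
    # to put the link on the clipboard
    if copy_link:
        copy_link.click()
        # Sleep 0.2 seconds so the clipboard
        # is in a ready state
        sleep(0.2)
        # Get the job link string and save it into
        # the result object 'details'
        details["Job Link"] = get_clipboard_data()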
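A fixed 0.2 second pause works when the clipboard is updated quickly, but it is a guess. One alternative, sketched below, is to poll a few times until a non-empty value comes back. `read_with_retry` is a hypothetical helper; it assumes that the reader function returns an empty string on failure, as `get_clipboard_data` does, and that the clipboard was emptied before the copy so that a stale value is not mistaken for the new link.

```python
import time


def read_with_retry(read_fn, attempts=5, delay=0.2):
    """Call `read_fn` until it returns a non-empty value or attempts run out."""
    for attempt in range(attempts):
        value = read_fn()
        if value:
            return value
        # Pause before the next try, but not after the last one
        if attempt < attempts - 1:
            time.sleep(delay)
    return ""
```

In the listing above, `details["Job Link"] = read_with_retry(get_clipboard_data)` would then replace the `sleep(0.2)` call and the direct read.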

Save search results into csv file

5.1 Here is the content of the resulting CSV file:

CSV File of Saved Records

5.2 Use Python's built-in csv module to save the data to a CSV file


# Library for CSV operations
import csv

# Save a list of dicts into a CSV file
def list_dict_to_csv(dicts, filename="test.csv"):
    # Open the CSV file and get the file object
    with open(filename, 'w', newline='') as output_file:
        # Use the keys of the first dict as the CSV header
        keys = dicts[0].keys()
        # Initialize the DictWriter object
        dict_writer = csv.DictWriter(output_file, keys)
        # Write the header row
        dict_writer.writeheader()
        # Write the data rows
        dict_writer.writerows(dicts)
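Because each field is only stored when its element is found, the dicts for different jobs may not share the same keys. `csv.DictWriter` raises a `ValueError` when a row contains a key that is not in the header, so taking the header from the first dict alone can fail. The sketch below is one possible variation, under a hypothetical name `list_dict_to_csv_safe`, that builds the header from all rows and fills missing values with an empty string.

```python
import csv


def list_dict_to_csv_safe(dicts, filename="test.csv"):
    """Write `dicts` to a CSV file; the header is the union of all keys."""
    # Collect keys in first-seen order across every row
    fieldnames = []
    for row in dicts:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(filename, "w", newline="", encoding="utf-8") as output_file:
        # restval fills in columns that a row does not have
        writer = csv.DictWriter(output_file, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(dicts)
    return fieldnames
```

The explicit `encoding="utf-8"` is also an assumption of this sketch; it avoids encoding errors when job titles contain characters outside the default Windows code page.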

Below is the complete implementation



# Library for web automation apis
# Locator used for selector reference
from clicknium import clicknium as cc, locator
# Library for delay function
from time import sleep
# Library for save dict list data into csv file
from csvutils import list_dict_to_csv
# Library for clear clipboard and get clipboard data
from clipboard import get_clipboard_data, clear_clipboard_data
# Library for get setting in 'setting.json' file
from setting import Setting
# Log in to LinkedIn.
# Note: the locator.linkedin.* names in this file are placeholders;
# use the names you gave the elements when capturing them.
def login():
    # Find the username input box and fill it with the
    # 'linkedin_login_name' value from setting.json
    _tab.find_element(locator.linkedin.login_email).set_text(Setting.login_name)
    # Find the password input box and fill it with the
    # 'linkedin_login_password' value from setting.json
    _tab.find_element(locator.linkedin.login_password).set_text(
        Setting.login_password)
    # Find the submit button and click it to log in
    _tab.find_element(locator.linkedin.login_submit).click()
    # Wait up to 5 seconds for the 'skip add phone' button;
    # if it appears, click it
    skip_btn = _tab.wait_appear(locator.linkedin.login_skip, wait_timeout=5)
    if skip_btn:
        skip_btn.click()
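# Search jobs with the keyword and location.
# The locator.linkedin.* names are placeholders for your captured elements.
def search_jobs():
    # After submitting the login information, wait for the page to
    # load: wait up to 10 seconds for the Jobs tab, then click it
    # to switch to the Jobs page
    jobs_tab = _tab.wait_appear(locator.linkedin.jobs_tab, wait_timeout=10)
    if jobs_tab:
        jobs_tab.click()
    # Wait up to 10 seconds for the job search keyword input box;
    # if it exists, fill it with the 'linkedin_search_job_key'
    # value from setting.json
    keyword_box = _tab.wait_appear(
        locator.linkedin.job_search_keyword, wait_timeout=10)
    if keyword_box:
        keyword_box.set_text(Setting.search_job_key)
    # Find the job search location input box and fill it with the
    # 'linkedin_search_job_location' value from setting.json
    _tab.find_element(locator.linkedin.job_search_location).set_text(
        Setting.search_job_location)
    # Find the search button and click it to search
    _tab.find_element(locator.linkedin.job_search_button).click()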
# Scrape the information of the top 10 jobs.
# For each job item, get the title, the company name,
# the company size, the post date, the job type, and the link,
# then save the search results into a CSV file.
# The locator.linkedin.* names are placeholders for your captured elements.
def get_job_top10_list():
    # Initialize the search result list
    job_list = []
    # Clear the clipboard data first
    clear_clipboard_data()
    # range(1, 11) gets the top 10 jobs;
    # change the upper bound to get more or fewer
    for i in range(1, 11):
        # Wait up to 5 seconds for the job item at this index
        ele = _tab.wait_appear(locator.linkedin.job_item,
                               {"index": i}, wait_timeout=5)
        # If the job item exists, click it
        # to show its detail information
        if ele:
            # Initialize the dict that holds this job item's details
            details = {}
            # Click the job item
            ele.click()
            # Wait up to 5 seconds for the job title
            job_title_ele = _tab.wait_appear(
                locator.linkedin.job_title, wait_timeout=5)
            # If the job title exists, save its text into 'details'
            if job_title_ele:
                details["Job Title"] = job_title_ele.get_text().strip()
            # Wait up to 2 seconds for the company name
            job_company_ele = _tab.wait_appear(
                locator.linkedin.job_company, wait_timeout=2)
            # If the company name exists, save its text into 'details'
            if job_company_ele:
                details["Company Name"] = job_company_ele.get_text().strip()
            # Wait up to 2 seconds for the company size
            company_size_ele = _tab.wait_appear(
                locator.linkedin.job_company_size, wait_timeout=2)
            # If the company size exists, keep its text only when
            # it mentions "employees", and save it into 'details'
            if company_size_ele:
                size_text = company_size_ele.get_text()
                scale = size_text.strip() if "employees" in size_text else ""
                details["Company Size"] = scale
            # Wait up to 2 seconds for the post date
            job_post_date_ele = _tab.wait_appear(
                locator.linkedin.job_post_date, wait_timeout=2)
            # If the post date exists, keep its text only when
            # it mentions "ago", and save it into 'details'
            if job_post_date_ele:
                date_text = job_post_date_ele.get_text()
                post_date = date_text.strip() if "ago" in date_text else ""
                details["Post Date"] = post_date
            # Wait up to 2 seconds for the job type
            job_type_ele = _tab.wait_appear(
                locator.linkedin.job_type, wait_timeout=2)
            # If the job type exists, save its text into 'details'
            if job_type_ele:
                details["Job Type"] = job_type_ele.get_text().strip()
            # Wait up to 2 seconds for the share button
            job_share_btn_ele = _tab.wait_appear(
                locator.linkedin.job_share_button, wait_timeout=2)
            # If the share button exists, click it
            if job_share_btn_ele:
                job_share_btn_ele.click()
                # Wait up to 2 seconds for the copy link button
                copy_link = _tab.wait_appear(
                    locator.linkedin.job_copy_link, wait_timeout=2)
                # If the copy link button exists, click it
                # to put the link on the clipboard
                if copy_link:
                    copy_link.click()
                    # Sleep 0.2 seconds so the clipboard is ready
                    sleep(0.2)
                    # Save the job link into 'details'
                    details["Job Link"] = get_clipboard_data()
            # Save this job item's result to the list
            job_list.append(details)
    # If there are any results, save them into the CSV file;
    # the file path is the 'result_csv_file' value in setting.json
    if job_list:
        list_dict_to_csv(job_list, Setting.result_csv_file)
if __name__ == "__main__":
    # Create a browser instance with "cc.chrome"
    # (for the Edge browser, use "cc.edge").
    # Open the browser at the given URL and get the browser tab.
    # By default it waits for the page to load completely,
    # so no extra time.sleep() is needed.
    _tab = cc.chrome.open("https://www.linkedin.com/", is_wait_complete=True)
    # Check whether a login with username and password is needed.
    # True: the login form is shown, so we need to log in
    # False: the website has remembered the authentication information
    # ('locator.linkedin.login_email' is a placeholder locator name)
    if _tab.is_existing(locator.linkedin.login_email):
        # Log in to LinkedIn
        login()
    # Search jobs with the keyword and location
    search_jobs()
    # Get the top 10 jobs from the search
    # results and save them into the CSV file
    get_job_top10_list()



6.2 csvutils.py

# Library for CSV operations
import csv

# Save a list of dicts into a CSV file
def list_dict_to_csv(dicts, filename="test.csv"):
    # Open the CSV file and get the file object
    with open(filename, 'w', newline='') as output_file:
        # Use the keys of the first dict as the CSV header
        keys = dicts[0].keys()
        # Initialize the DictWriter object
        dict_writer = csv.DictWriter(output_file, keys)
        # Write the header row
        dict_writer.writeheader()
        # Write the data rows
        dict_writer.writerows(dicts)



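6.3 clipboard.py

# Library for the win32 clipboard API
import win32clipboard

# Clear the clipboard data
def clear_clipboard_data():
    # Open the clipboard
    win32clipboard.OpenClipboard()
    try:
        # Empty the clipboard
        win32clipboard.EmptyClipboard()
    finally:
        # Always close the clipboard
        win32clipboard.CloseClipboard()

# Get the clipboard data
def get_clipboard_data():
    try:
        # Open the clipboard
        win32clipboard.OpenClipboard()
    except Exception:
        # The clipboard could not be opened: return an empty string
        return ""
    try:
        # Read the clipboard data and return it
        data = win32clipboard.GetClipboardData()
        return data
    except Exception:
        # If reading fails, return an empty string
        return ""
    finally:
        # Always close the clipboard
        win32clipboard.CloseClipboard()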



6.4 setting.py

# Library for JSON operations
import json
class Setting(object):
    # Open json file and get file object
    # Load json data
    with open("setting.json") as f:
        data = json.load(f)
    # Value set for LinkedIn login username
    login_name = data['linkedin_login_name']
    # Value set for LinkedIn login password
    login_password = data['linkedin_login_password']
    # Value set for LinkedIn job search keyword
    search_job_key = data['linkedin_search_job_key']
    # Value set for LinkedIn job search location
    search_job_location = data['linkedin_search_job_location']
    # Value set for csv file path to save search results
    result_csv_file = data['result_csv_file']
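The `Setting` class reads every key at import time, so a missing key in setting.json surfaces as a bare `KeyError`. As a sketch of a friendlier check, the hypothetical function `load_settings` below validates the five keys used in this article and reports all missing ones at once. It uses only the standard library and is not part of the original sample.

```python
import json

# The keys read by the Setting class in this article
REQUIRED_KEYS = (
    "linkedin_login_name",
    "linkedin_login_password",
    "linkedin_search_job_key",
    "linkedin_search_job_location",
    "result_csv_file",
)


def load_settings(path="setting.json"):
    """Load the settings file and check that every required key is present."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError("setting.json is missing keys: " + ", ".join(missing))
    return data
```

Calling `load_settings()` once at the start of the script would fail early, before the browser is opened, if the configuration is incomplete.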

6.5 setting.json

{
    "linkedin_login_name": "your account username",
    "linkedin_login_password": "your account password",
    "linkedin_search_job_key": "your desired job title",
    "linkedin_search_job_location": "your desired job location",
    "result_csv_file": "C:\\test\\test.csv"
}

6.6 Output

Here is the video of the complete execution:

complete execution

