Getting Started with Web Automation using Selenium with Python

Automation is the use of machines or software to perform tasks with little or no human effort.

In this article, we will be looking at automating web processes.

Web automation means letting software robots carry out processes and tasks on the web automatically.

Web automation lets us do many things. For example:

  • Search the web.
  • Delete emails.
  • Fill forms.
  • Log into websites.

Repetitive tasks need to be done quickly in the modern world, which makes automation necessary.

Selenium is a framework used for web application testing, automating software tests, and scraping the web.

In Python, Selenium is available as a library that helps developers interact with the web and automate web processes.

Selenium is a very powerful tool for interacting with web browsers. It supports all modern web browsers and can be used from various programming languages such as Java, Python, and C#.

In this guide, we will be looking at how to use Selenium to write Python scripts that automate basic web tasks.

Prerequisites

To understand this guide, the reader must be familiar with:

  • HTML tags, elements, IDs, and classes.
  • Basics of the Python programming language.

Goal

In this guide, we will focus on building two Python automation scripts.

One will perform a Google search based on the keyword "University", and the other will automatically log in to Quora.

At the end of this guide, the reader will be able to write Python scripts that can:

  • Find elements in the browser.
  • Insert text into form fields in the browser.
  • Click buttons in the browser.

The expected result would be:

[Image: demo]

Setting up the environment

First, we will need to create a virtual environment in Python. Click here to learn how to create a virtual environment.

Next, install the Selenium package with the following command:

pip install selenium

We also have to install a web driver (a tool that is needed for web automation). The web driver helps us interact with the browser.

On Windows, we will use a package manager called Chocolatey to install the web driver.

Click here to install Chocolatey.

Then install chromedriver with the command below:

choco install chromedriver

On macOS, we will use Homebrew with the command below:

brew install --cask chromedriver

The version of chromedriver should be compatible with your browser version.

If you encounter a compatibility error, then download the driver based on your browser version from here.
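If you download the driver manually, you can point Selenium at that exact file instead of relying on the one found on your PATH. The sketch below assumes Selenium 4, where the driver location is passed through a Service object. The helper name chrome_from_driver_path and the example path are hypothetical.

```python
import os


def chrome_from_driver_path(driver_path):
    """Start Chrome with a chromedriver binary at an explicit location."""
    if not os.path.isfile(driver_path):
        raise FileNotFoundError(f"No chromedriver found at {driver_path!r}")

    # Imported here so the path check above works even before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=driver_path))


# Usage (the path is a placeholder for wherever you saved the download):
#   driver = chrome_from_driver_path('/path/to/chromedriver')
```

Checking the path first gives a clearer error than the one Selenium raises when the file is missing.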

Create a file app.py and add the code below:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.google.com/')

The above code snippet opens a browser and loads a web URL.

The first line imports webdriver from Selenium. The second line starts the Chrome web driver, which opens a Chrome window.

NOTE: Each browser has its own web driver. If you prefer to use a different browser, look up the name of its driver.

For instance, Firefox uses geckodriver, and we would create the driver with webdriver.Firefox().
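The note above can be sketched as a small helper that picks the driver class from a browser name. This is only an illustration: make_driver and SUPPORTED_BROWSERS are hypothetical names, and each browser still needs its own driver program available on your machine.

```python
SUPPORTED_BROWSERS = ("chrome", "firefox", "edge", "safari")


def make_driver(browser_name):
    """Return a Selenium driver for the named browser (hypothetical helper)."""
    name = browser_name.strip().lower()
    if name not in SUPPORTED_BROWSERS:
        raise ValueError(f"Unsupported browser: {browser_name!r}")

    # Imported lazily so the name check above runs without Selenium installed.
    from selenium import webdriver

    driver_classes = {
        "chrome": webdriver.Chrome,    # needs chromedriver
        "firefox": webdriver.Firefox,  # needs geckodriver
        "edge": webdriver.Edge,        # needs msedgedriver
        "safari": webdriver.Safari,    # uses safaridriver on macOS
    }
    return driver_classes[name]()


# Usage:
#   driver = make_driver('firefox')
#   driver.get('https://www.google.com/')
```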

On the third line of app.py, driver.get() tells the browser to load the given URL.

You can run the code using the command below:

python app.py

The above code opens up the Chrome browser as shown in the image below:

[Image: browser open]

Next, we will be entering a search keyword into the search field of the Google website. To do that, we will have to get the search field element by inspecting the page.

To inspect the page, right-click anywhere on the Google page and click Inspect.

The browser will open a window as shown in the image below:

[Image: inspect browser]

Before we continue, we need to understand what locators in Selenium are.

Locators are the ways we identify elements on a web page. They help us find any element on the page.

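There are different types of locators we can use to identify elements on a web page. They include id, name, class name, and XPath.

In Selenium 4, we pass the locator type from the By class (imported with from selenium.webdriver.common.by import By) to find_element(), as shown below:

  • driver.find_element(By.ID, 'some-id').
  • driver.find_element(By.NAME, 'some-name').
  • driver.find_element(By.CLASS_NAME, 'some-class').
  • driver.find_element(By.XPATH, 'some-xpath').

Older tutorials use helper methods such as find_element_by_id and find_element_by_xpath. These were deprecated in Selenium 4.0 and removed in later 4.x releases, so they fail on a fresh installation.

Here, id, name, and class are HTML attributes placed inside HTML tags to identify, name, and group elements.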

XPath (XML Path Language) is a query syntax for selecting elements in an XML or HTML document.
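To get a feel for XPath without opening a browser, the sketch below runs two queries against a small, made-up login form using Python's built-in xml.etree.ElementTree module. ElementTree supports only a subset of XPath, and the page markup here is a hypothetical example, but the two query styles match the ones used in this guide: an attribute match and a position-based path.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified login page (well-formed so the XML parser accepts it).
PAGE = """
<html>
  <body>
    <form>
      <div><input id="email" name="email" class="field" /></div>
      <div><input id="password" name="password" class="field" /></div>
      <button id="login">Login</button>
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Attribute match: any <input> whose id attribute is "email".
email = root.find(".//input[@id='email']")

# Position-based path: the <input> in the second <div> of the form (indexes start at 1).
password = root.find("./body/form/div[2]/input")

print(email.get("name"))   # email
print(password.get("id"))  # password
```

The attribute match keeps working if the page layout changes, while the position-based path breaks as soon as a div is added or removed.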

To get the element, hover over the div tags in the inspector and keep expanding the one that highlights the search bar until you find the tag that highlights only the search field.

Then, right-click that tag and choose Copy > Copy full XPath. Next, paste the XPath as shown:

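from selenium.webdriver.common.by import By

# Locate the search field using the XPath copied from the inspector
searchField = driver.find_element(By.XPATH, '/html/body/div[1]/div[3]/form/div[1]/div[1]/div[1]/div/div[2]/input')
searchField.send_keys('university')  # Type the keyword into the field

searchField.submit()  # Submit the form that contains the field

From the above code snippet:

  • We import By, which names the locator type we pass to find_element().
  • We initialized the variable searchField with the element found by the XPath that we copied.
  • send_keys() is used to insert text into the searchField element.
  • searchField.send_keys('university') inserts the value university into the search box.
  • searchField.submit() submits the search request.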

NOTE: You can also search for the submit button if the web page has such an element and use the click() method on it. But, the submit() method makes it easier.
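The click-based alternative from the note can be sketched as a small helper that fills a field and then clicks a button. This is a minimal sketch, not part of the original scripts: fill_and_click is a hypothetical name, and the element IDs in the usage comment are placeholders for whatever you find when inspecting your own page.

```python
def fill_and_click(driver, field_locator, text, button_locator):
    """Type text into one element, then click another.

    Each locator is a (strategy, value) pair such as (By.ID, 'email').
    """
    field = driver.find_element(*field_locator)
    field.clear()  # Remove any text already in the field
    field.send_keys(text)
    driver.find_element(*button_locator).click()


# Usage with a real browser (URL and IDs are hypothetical placeholders):
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   driver = webdriver.Chrome()
#   driver.get('https://example.com/search')
#   fill_and_click(driver, (By.ID, 'search-box'), 'university', (By.ID, 'search-button'))
```

Passing the locators in as arguments keeps the helper independent of any one website.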

Your complete code will look like the snippet below:

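from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # Start Chrome
driver.get('https://www.google.com/')  # Load the Google home page

# Locate the search field using the XPath copied from the inspector
searchField = driver.find_element(By.XPATH, '/html/body/div[1]/div[3]/form/div[1]/div[1]/div[1]/div/div[2]/input')
searchField.send_keys('university')  # Type the keyword into the field

searchField.submit()  # Submit the search form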

If you run your code, it will open the browser, request the Google webpage, enter university in the search box, and submit the search automatically.
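The script above never closes the browser. Once you no longer need the window, calling driver.quit() releases it. One possible way to make sure that always happens, even when a step fails, is a small context manager like the sketch below. The name browser_session is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Yield a driver and always quit it afterwards.

    make_driver is any zero-argument callable that returns a driver,
    for example webdriver.Chrome.
    """
    driver = make_driver()
    try:
        yield driver
    finally:
        driver.quit()  # Closes every window and stops the driver process


# Usage with a real browser:
#   from selenium import webdriver
#   with browser_session(webdriver.Chrome) as driver:
#       driver.get('https://www.google.com/')
```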

Automate logging into a website

Using what we have learned from the previous example, let us try to log in to the Quora website. To do that, create a new file named main.py inside the project directory and add the code below to it.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

driver.get('https://www.quora.com/') # Open Quora website

emailField = driver.find_element(By.XPATH, '//*[@id="email"]') # Email input field
emailField.send_keys('YourEmail') # Login email address

passwordField = driver.find_element(By.XPATH, '//*[@id="password"]') # Password input field
passwordField.send_keys('YourPassword') # Login password

button = driver.find_element(By.XPATH, '//*[@id="root"]/div[2]/div/div/div/div/div/div[2]/div[2]/div[4]/button/div/div/div') # Login button

button.click() # Click the login button

From the code snippet above:

  • First, we import webdriver and By from Selenium.
  • We create a Chrome driver with webdriver.Chrome() and store it in the variable driver so that we can reuse it.
  • driver.get('https://www.quora.com/') sends a request to Quora.
  • emailField = driver.find_element(By.XPATH, '//*[@id="email"]') finds the email field by XPath.
  • emailField.send_keys('YourEmail') inserts the email address into the email field.
  • passwordField = driver.find_element(By.XPATH, '//*[@id="password"]') finds the password field by XPath.
  • passwordField.send_keys('YourPassword') inserts the password into the password field.
  • button = driver.find_element(By.XPATH, '//*[@id="root"]/div[2]/div/div/div/div/div/div[2]/div[2]/div[4]/button/div/div/div') finds the login button by XPath.
  • button.click() clicks the login button.

When you run main.py, the Chrome browser opens, sends a request to the Quora website, fills in the login details, and logs you into your Quora account.
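Typing a real email address and password straight into the script makes them easy to leak. One possible alternative is to read them from environment variables, as in the sketch below. The variable names QUORA_EMAIL and QUORA_PASSWORD and both function names are hypothetical. The XPath values are the ones used in the login script.

```python
import os


def load_credentials(env=None):
    """Return (email, password) read from environment variables.

    QUORA_EMAIL and QUORA_PASSWORD are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    email = env.get("QUORA_EMAIL")
    password = env.get("QUORA_PASSWORD")
    if not email or not password:
        raise RuntimeError("Set QUORA_EMAIL and QUORA_PASSWORD first")
    return email, password


def fill_login_fields(driver, email, password):
    """Type the credentials into the two fields located in the login script."""
    from selenium.webdriver.common.by import By  # Needs Selenium installed

    driver.find_element(By.XPATH, '//*[@id="email"]').send_keys(email)
    driver.find_element(By.XPATH, '//*[@id="password"]').send_keys(password)


# Usage:
#   email, password = load_credentials()
#   fill_login_fields(driver, email, password)
```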

Conclusion

In conclusion, we wrote two Python scripts: one performs a Google search and the other logs in to Quora.

Working through the two examples above shows how you can use Selenium to:

  • Navigate to any URL.
  • Find any HTML element.
  • Fill and submit any form.

You can check out the full code here.

Peer Review Contributions by: Srishilesh P S

Published on: Sep 4, 2021
Updated on: Jul 12, 2024