Selenium Introduction
The purpose of this series is to walk through the different features described in the Selenium documentation. Each
feature will have its own example so that we can better understand when and where to use it.
Background
Selenium is usually used for automated testing, but it can also be used for scraping websites. Many modern
websites are built as Single-Page Applications (SPAs) whose pages are lazy loaded. This means that the basic
structure (HTML and CSS) is loaded first, and the data is loaded afterwards through API calls and JavaScript.
Scrapers such as the requests library for Python (tutorials can be found here) or Guzzle for PHP only fetch the
raw response and cannot run JavaScript. There are workarounds for handling those JavaScript interactions, but
sometimes they lead to a dead end. Selenium solves this kind of problem by interacting with the website through a
real browser, so it behaves just like a person using the site.
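To make the lazy-loading problem concrete, here is a minimal sketch using only the Python standard library. The HTML string is a made-up example of what a plain HTTP client might receive from a single-page app (it is not taken from any real site): the list is empty in the raw markup because the items are only added once a browser runs the script.

```python
from html.parser import HTMLParser

# Hypothetical raw response of a single-page app. The <ul> is empty until a
# browser executes the script, which a plain HTTP client never does.
RAW_HTML = """
<html>
  <body>
    <ul id="items"></ul>
    <script>
      fetch("/api/items").then(function (response) { return response.json(); });
    </script>
  </body>
</html>
"""


class ItemCounter(HTMLParser):
    """Counts the <li> and <script> tags present in static markup."""

    def __init__(self):
        super().__init__()
        self.li_count = 0
        self.script_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_count += 1
        elif tag == "script":
            self.script_count += 1


def count_static_items(html):
    """Return (number of <li> tags, number of <script> tags) found in html."""
    parser = ItemCounter()
    parser.feed(html)
    return parser.li_count, parser.script_count
```

Calling count_static_items(RAW_HTML) finds no list items at all, only the script that would have created them. A browser driven by Selenium runs that script, so the items exist by the time we read the page.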
Getting the Software Required for Selenium
We will now start to code, but first let's make sure that we have everything required.
- Install Python. You can download Python here depending on
your OS. The installation steps depend on the OS that you are using.
- Install the required library, in this case selenium. You can run the command
pip install selenium
or
python -m pip install selenium
- (Optional) You can download PyCharm here to make your coding faster. I will be using PyCharm in these
tutorials, but you can also use Notepad and the command line.
- Download the browser drivers (geckodriver for Firefox, chromedriver for Chrome) and put them in a directory
that is accessible to the app. You can also paste them in later. I will be using Firefox for most of the
tutorials, as geckodriver and Firefox stay compatible even when the browser is updated.
- (Optional) Install the Chrome and Firefox browsers if you do not have them yet.
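Before moving on, it can help to confirm that the pieces above are in place. The following is a small sketch, not part of Selenium itself: check_setup is a hypothetical helper that reports whether the selenium package is importable and whether the driver executables can be found. It assumes the drivers are named geckodriver and chromedriver.

```python
import importlib.util
import shutil


def check_setup(driver_names=("geckodriver", "chromedriver"), search_path=None):
    """Report whether selenium is importable and where each driver was found.

    search_path is a string of directories to search, separated by os.pathsep.
    When it is None, the PATH environment variable is used.
    """
    report = {"selenium": importlib.util.find_spec("selenium") is not None}
    for name in driver_names:
        # shutil.which returns the full path, or None when nothing is found.
        report[name] = shutil.which(name, path=search_path)
    return report
```

For example, print(check_setup(search_path=".")) looks for the drivers in the current directory, which is where this tutorial places them. A value of None next to a driver name means that driver was not found there.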
Coding My First Selenium Program
- Create a directory where you will put your code.
- Copy the drivers that you have downloaded and paste them into the directory you've created.
- Create a file seleniumintro.py and paste in the following code
from selenium import webdriver
import time
The code above imports the libraries that we will use.
- Add this line
driver = webdriver.Firefox(executable_path="geckodriver.exe")
The code above creates a WebDriver instance for Firefox, using the geckodriver that you copied into the directory.
- Add this line
driver.get("https://slackingslacker.github.io/seleniumindex")
This line goes to the website (https://slackingslacker.github.io/seleniumindex).
- Add this line
time.sleep(5)
We pause the program for 5 seconds so that you can see what's happening in the browser.
- Add this line
driver.close()
This line closes the browser window. Since it is the only window open, the browser closes as well.
- Add this line
driver = webdriver.Chrome(executable_path="chromedriver.exe")
This time we are going to use the Chrome browser to access the website.
- Add this line
driver.get("https://slackingslacker.github.io/seleniumindex")
This line goes to the website (https://slackingslacker.github.io/seleniumindex) in Chrome.
- Add this line
time.sleep(5)
We will again pause the program for 5 seconds.
- Add this line
driver.close()
This line closes the browser window. Since it is the only window open, the browser closes as well.
- Run seleniumintro.py.
python seleniumintro.py
or
run it from PyCharm.
It should do the following:
- Opens the Firefox browser (assuming you have Firefox installed).
- Goes to the website https://slackingslacker.github.io/seleniumindex
- Waits for 5 seconds
- Closes the Firefox browser
- Opens the Chrome browser (assuming you have Chrome installed).
- Goes to the website https://slackingslacker.github.io/seleniumindex
- Waits for 5 seconds
- Closes the Chrome browser
Final Selenium Code
from selenium import webdriver
import time
# Open the website in Firefox, wait 5 seconds, then close the window
driver = webdriver.Firefox(executable_path="geckodriver.exe")
driver.get("https://slackingslacker.github.io/seleniumindex")
time.sleep(5)
driver.close()
# Do the same in Chrome
driver = webdriver.Chrome(executable_path="chromedriver.exe")
driver.get("https://slackingslacker.github.io/seleniumindex")
time.sleep(5)
driver.close()
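Depending on the Selenium version you have installed, the executable_path keyword used above may no longer be accepted. Newer Selenium 4 releases expect the driver path to be wrapped in a Service object instead. The following is a sketch of the same program in that style, not a replacement for the code above. The helper names make_firefox, make_chrome, visit and main are my own. It also uses driver.quit(), which ends the whole browser session, and wraps the visit in try/finally so the browser is shut down even if loading the page fails.

```python
import time

URL = "https://slackingslacker.github.io/seleniumindex"


def make_firefox():
    # Imported here so the file can be loaded before selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    # Assumes geckodriver.exe sits in the working directory, as in this tutorial.
    return webdriver.Firefox(service=Service(executable_path="geckodriver.exe"))


def make_chrome():
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Assumes chromedriver.exe sits in the working directory, as in this tutorial.
    return webdriver.Chrome(service=Service(executable_path="chromedriver.exe"))


def visit(make_driver, url=URL, pause_seconds=5):
    """Open a browser, load url, pause, and always shut the browser down."""
    driver = make_driver()
    try:
        driver.get(url)
        time.sleep(pause_seconds)
    finally:
        # quit() ends the session: every window and the driver process.
        driver.quit()


def main():
    # Same sequence as the tutorial: Firefox first, then Chrome.
    for make_driver in (make_firefox, make_chrome):
        visit(make_driver)
```

Calling main() at the bottom of the file runs the same Firefox-then-Chrome sequence as before. Because visit only needs an object with get and quit methods, the two browsers share one code path instead of two copies of the same four lines.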
Conclusion
Using Selenium, we can open a browser and automatically use the functionality of a website. With just a few lines
of code, we can get started with Selenium.