Basic Programming - Do It Simpler: Web Scraping Using Requests

Requests and BeautifulSoup Using Tag

We will use BeautifulSoup to get the name of a tag and some of its attributes.

Before we proceed, please make sure you have read the first and second posts in this series and completed the prerequisites.

Let's start coding.

Create a file named simpletagattr.py and paste in the following code:

```
from bs4 import BeautifulSoup
import requests
url = "https://slackingslacker.github.io/simpletags.html"
html_doc = requests.get(url).text
soup = BeautifulSoup(html_doc, "html.parser")
print(soup.find(id="hDiv").name)
image_tag = soup.find(id="imgId")
print(image_tag["src"])
print(image_tag["width"])
print(image_tag["height"])
print(image_tag["alt"])
```

Run simpletagattr.py. It should print the following:
```
div
simple.png
200
200
Sample Image
```

What did we do?

We import the BeautifulSoup class from the bs4 library.
```
from bs4 import BeautifulSoup
```
We import the requests library.
```
import requests
```

We declare the URL that we will use for scraping.

```
url = "https://slackingslacker.github.io/simpletags.html"
```

On the 4th line, we request the page and assign its HTML text to html_doc.
```
html_doc = requests.get(url).text
```
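The one-liner above assumes the request succeeds. As a minimal sketch, assuming you would rather fail early on a bad response than parse an error page, the same fetch could be wrapped in a small helper. The name fetch_html and the 10-second timeout are illustrative choices, not part of the original script.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the HTML text of a page, raising on HTTP errors.

    fetch_html is a hypothetical helper name; the timeout value is an assumption.
    """
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of returning an error page
    response.raise_for_status()
    return response.text
```

With this helper, the 4th line of the script could become html_doc = fetch_html(url).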
We create a BeautifulSoup instance from the HTML text in html_doc and assign it to the variable soup. The second argument, html.parser, tells BeautifulSoup which parser to use for the HTML.
```
soup = BeautifulSoup(html_doc, "html.parser")
```
On the 6th line,
```
print(soup.find(id="hDiv").name)
```
we print the name of the tag. The call below returns the Tag object for the first element it finds with the id hDiv:
```
soup.find(id="hDiv")
```
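Note that find returns None when nothing matches, so chaining .name directly would raise an AttributeError on a page that has no such id. Here is a small sketch of a guarded lookup. It uses an inline HTML string as a made-up stand-in for the real page, whose markup is not shown in this post.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page
html_doc = '<div id="hDiv"><p>Hello</p></div>'
soup = BeautifulSoup(html_doc, "html.parser")

found = soup.find(id="hDiv")
missing = soup.find(id="noSuchId")  # no element has this id, so this is None

# Check for None before reading .name
found_name = found.name if found is not None else None
missing_name = missing.name if missing is not None else None

print(found_name)    # div
print(missing_name)  # None
```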
On the 7th line,
```
image_tag = soup.find(id="imgId")
```
we assign the Tag object to image_tag. We will use this object to print the element's attributes.
On the succeeding lines, we print the src, width, height and alt attributes, in that order.
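Indexing a Tag like a dictionary raises a KeyError when the attribute is absent. As a hedged sketch of the alternatives, the example below again uses made-up inline markup rather than the real page, and deliberately leaves out the alt attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's <img> may carry different attributes
html_doc = '<img id="imgId" src="simple.png" width="200" height="200">'
soup = BeautifulSoup(html_doc, "html.parser")
image_tag = soup.find(id="imgId")

# .get() returns a default instead of raising KeyError for a missing attribute
alt_text = image_tag.get("alt", "no alt text")

# .attrs exposes all attributes of the tag as a dictionary
attributes = image_tag.attrs

print(image_tag["src"])    # simple.png
print(alt_text)            # no alt text
print(sorted(attributes))  # ['height', 'id', 'src', 'width']
```

Attribute values come back as strings, so width here is "200" and needs int() before any arithmetic.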

Conclusion

We used the Tag class to read properties of an HTML element, such as its name and its attributes.

Monday, April 27, 2020

Web Scraping Using Requests - Beautifulsoup Tag Class
