Monday, April 27, 2020

Web Scraping Using Requests - BeautifulSoup Tag Class

Requests and BeautifulSoup Using Tag

We will use BeautifulSoup to get the name of a tag and some of its attributes.
 
Before we proceed, please make sure you have read the first and second posts in this series and completed the prerequisites.
 

Let's start coding.

  1. Create a file named simpletagattr.py and paste in the following code:
    from bs4 import BeautifulSoup
    import requests
    url = "https://slackingslacker.github.io/simpletags.html"
    html_doc = requests.get(url).text
    soup = BeautifulSoup(html_doc, "html.parser")
    print(soup.find(id="hDiv").name)
    image_tag = soup.find(id="imgId")
    print(image_tag["src"])
    print(image_tag["width"])
    print(image_tag["height"])
    print(image_tag["alt"])
    
  2. Run simpletagattr.py. It should print the following text:
    div
    simple.png
    200
    200
    Sample Image
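
If the page is unreachable, the same calls can be tried against an inline HTML string. The sketch below is only an illustration: SAMPLE_HTML is a hypothetical stand-in, written on the assumption that simpletags.html contains a div with id hDiv and an img with id imgId carrying the attribute values printed above. The real page may look different.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an assumption about what simpletags.html might contain,
# chosen so that it reproduces the output listed above.
SAMPLE_HTML = """
<html>
  <body>
    <div id="hDiv">
      <img id="imgId" src="simple.png" width="200" height="200" alt="Sample Image">
    </div>
  </body>
</html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# .name is the element name of the Tag, as a string
div_name = soup.find(id="hDiv").name

# Indexing a Tag with an attribute name returns that attribute's value
image_tag = soup.find(id="imgId")
image_values = [image_tag[key] for key in ("src", "width", "height", "alt")]

print(div_name)
for value in image_values:
    print(value)
```

Because no network request is made, this version is handy for experimenting with find() and attribute access before pointing the script at a live URL.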
    
 

What did we do?

  • We import the BeautifulSoup class from the bs4 library.
    from bs4 import BeautifulSoup
    
  • We import the requests library.
    import requests
    
  • We declare the URL that we will use for scraping.
    url = "https://slackingslacker.github.io/simpletags.html"
    
  • On the 4th line, we request the page and assign the text of the response (the page's HTML) to html_doc.
    html_doc = requests.get(url).text
    
  • We create a BeautifulSoup instance from the HTML text in html_doc and assign it to the variable soup. We use html.parser, Python's built-in HTML parser, to parse the HTML.
    soup = BeautifulSoup(html_doc, "html.parser")
    
  • On the 6th line,
    print(soup.find(id="hDiv").name)
    
    we print the name of the Tag, which here is div. The following code returns the Tag object for the first element it finds with the id hDiv:
    soup.find(id="hDiv")
    
  • On the 7th line,
    image_tag = soup.find(id="imgId")
    
    we assign the Tag object to image_tag. We will use this object to print the element's attributes.
    
  • On the succeeding lines, we print the src, width, height, and alt attributes, respectively.
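
The steps above assume that every element and attribute exists. As a small sketch of what happens when one does not, the example below uses a hypothetical inline HTML string (not the real page) in which the img has no width attribute. It relies on three things BeautifulSoup provides: the attrs dictionary of a Tag, the get() method of a Tag, and the fact that find() returns None when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; note the missing width attribute
html_doc = '<div id="hDiv"><img id="imgId" src="simple.png" alt="Sample Image"></div>'
soup = BeautifulSoup(html_doc, "html.parser")

image_tag = soup.find(id="imgId")

# .attrs holds all attributes of the tag as a dictionary
all_attrs = dict(image_tag.attrs)

# image_tag["width"] would raise KeyError here because the attribute is absent;
# .get() returns None (or a default you supply) instead
width = image_tag.get("width")
width_or_default = image_tag.get("width", "unknown")

# find() returns None when nothing matches, so check before using the result
missing_tag = soup.find(id="noSuchId")
missing_name = missing_tag.name if missing_tag is not None else None

print(all_attrs)
print(width, width_or_default, missing_name)
```

Whether to use square brackets or get() is a judgment call: square brackets fail loudly when the page changes, while get() lets the script continue with a fallback value.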
 

Conclusion

We used the Tag class to obtain different properties of an HTML element, such as its name and its attributes.
 
