Intro to Requests Scraper
    We will write the simplest possible web scraper, using Python as our base language and the requests library.
 
    
What is Web Scraping?
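    Web scraping is the automated extraction of data from a website. A program downloads a page, much as a browser would, and then picks out the information it needs.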
 
    What to do next?
    Before we start to code, let's make sure that we have everything required.
    
        - Install Python. Download the installer for your OS from the official Python website; the installation steps depend on the OS that you are using.
        
 
        - Install the required library, in this case requests. You can run either of the following commands:
pip install requests
            or
python -m pip install requests
         
        - (Optional) Download PyCharm to make your coding faster. I will be using PyCharm in these tutorials, but you can also use Notepad and the command line.
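    To confirm that the install step worked, a small check like the one below can help. It is only a sketch: the helper name is_installed is made up for this example, and it relies on the standard library's importlib.util.find_spec to see whether the interpreter can locate the package.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if Python can locate the named top-level package."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package_name) is not None


# Report whether requests is importable from this interpreter.
print("requests installed:", is_installed("requests"))
```

    If this prints False, the pip command probably installed requests for a different Python interpreter than the one running the script; python -m pip install requests targets the interpreter you invoke.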
        
 
    
 
 
    Let's start to code.
    
        - Create a directory where you will put your code
 
        - Create a file simple.py and paste the following code
            
import requests
url = "https://slackingslacker.github.io/simple.html"
print(requests.get(url).text)
             
         
        - Run simple.py.
python simple.py
            or run it from PyCharm.

            It should print the HTML of the page:
<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Tutorials</title>
</head>
<body>
    <div id="d">
        Inside the div tag but outside the p tag.
        <p id="p">This is inside the p tag.</p>
    </div>
</body>
</html>
         
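    Downloading the page is only the first half of scraping; the data still has to be pulled out of the HTML. As a rough illustration, the sketch below uses the standard library's html.parser to collect the text inside the p tag. The ParagraphText class and the PAGE constant are names invented for this example, and the HTML is a shortened copy of the output above so that the sketch runs without a network connection.

```python
from html.parser import HTMLParser

# A shortened copy of the page printed above, embedded so the sketch runs offline.
PAGE = """<div id="d">
    Inside the div tag but outside the p tag.
    <p id="p">This is inside the p tag.</p>
</div>"""


class ParagraphText(HTMLParser):
    """Collect the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.inside_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        # Start a new, empty entry whenever a <p> tag opens.
        if tag == "p":
            self.inside_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.inside_p = False

    def handle_data(self, data):
        # Only keep text that appears between <p> and </p>.
        if self.inside_p:
            self.paragraphs[-1] += data


parser = ParagraphText()
parser.feed(PAGE)
parser.close()
print(parser.paragraphs)  # ['This is inside the p tag.']
```

    In a real script, the string passed to feed would be requests.get(url).text instead of the embedded PAGE constant.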
    
 
 
 
    
Conclusion
    With just 3 lines of code, we successfully scraped a website by downloading its whole page.
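    The three-line script assumes that the request always succeeds. A slightly more defensive variant might look like the sketch below. The function name fetch_html and the 10 second timeout are choices made for this example, not part of the original script; timeout and raise_for_status are standard parts of the requests API.

```python
import requests


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on network or HTTP errors."""
    # Without a timeout, requests can wait indefinitely for a slow server.
    response = requests.get(url, timeout=timeout)
    # Turn 4xx and 5xx responses into requests.exceptions.HTTPError.
    response.raise_for_status()
    return response.text
```

    Calling print(fetch_html("https://slackingslacker.github.io/simple.html")) should behave like simple.py when the page is reachable, and raise an exception instead of printing an error page when it is not.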
 
    Just an update to our simple scraper. The following are samples of the different HTTP methods.
    
        - Create a directory where you will put your code

        - Create a file simple.py and paste the following code
            
import requests
url = "https://slackingslacker.github.io/simple.html"
print(requests.get(url).text)
url = "http://slackingslacker.pythonanywhere.com"
# Using GET Method
print(requests.get(url+"/get").text)
# Using POST Method
print(requests.post(url+"/post").text)
# Using PUT Method
print(requests.put(url+"/put").text)
# Using DELETE Method
print(requests.delete(url+"/delete").text)
# Using POST Method With FORM Submission
print(requests.post(url+"/postdata",
                    data={"name":"slackingslacker",
                          "location": "earth",
                          "height": "normal human"}).text)
             
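    To see what a form submission actually puts on the wire, the request can be built without being sent. The sketch below uses requests.Request and its prepare() method for that; it reuses the test server URL and form fields from the script above, and it makes no network call.

```python
import requests

# Same test server as in the script above.
url = "http://slackingslacker.pythonanywhere.com"

# Build the form POST without sending it, so no network access is needed.
request = requests.Request(
    "POST",
    url + "/postdata",
    data={"name": "slackingslacker",
          "location": "earth",
          "height": "normal human"},
)
prepared = request.prepare()

print(prepared.method)                   # HTTP method that would be used
print(prepared.url)                      # full URL of the request
print(prepared.headers["Content-Type"])  # how the body is encoded
print(prepared.body)                     # the encoded form fields
```

    The data dictionary is sent as a URL-encoded form body, which is the same encoding a browser uses for a plain HTML form; that is why the server can read the values as form fields.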
         
    
 
 
 