All Courses

Python - WebForm Submission

Neha Kumawat

3 years ago

WebForm Submission in Python | Insideaiml
Table of Contents
  • Introduction
  • Example
  • Submitting a web form with requests

Introduction

          One of the most challenging tasks in web scraping is being able to log in automatically and extract data within your account on that website. In this tutorial, you will learn how you can extract all forms from web pages as well as filling and submitting them using requests_html  libraries.
Often some data is needed to be submitted to the server through the forms present in the HTML page for the interaction with a webpage. Processes like signing up for a new account or supplying some information like name or roll number to retrieve the result of an examination can be performed using this webform. The requests module handles this gracefully using the POST method with the required parameters.

Example

         In the below example we use the signup form of a website by supplying the userid and password value. After the submission of the values, we print the response.

import requests

ID_USERNAME = 'signup-user-name'
ID_PASSWORD = 'signup-user-password'
USERNAME = 'username'
PASSWORD = 'yourpassword'
SIGNUP_URL = 'http://codepad.org/login'
def submit_form():
    """Submit a form"""
    payload = {ID_USERNAME : USERNAME, ID_PASSWORD : PASSWORD,}

    resp = requests.get(SIGNUP_URL)
    print "Response to GET request: %s" %resp.content

    resp = requests.post(SIGNUP_URL, payload)
    print "Headers from a POST request response: %s" %resp.headers
#print "HTML Response: %s" %resp.read()

if __name__ == '__main__':
    submit_form()
When we run the above program, we get the following output
Response to GET request: 

Login - codepad

	.....................
	..................... 

    
    
    
    



    

Submitting a web form with requests

          The requests package does form submissions a little bit more elegantly. Let’s take a look:
# Python 2.x example
import requests

url = 'https://duckduckgo.com/html/'
payload = {'q':'python'}
r = requests.get(url, params=payload)
with open("requests_results.html", "w") as f:
    f.write(r.content)
With requests, you just need to create a dictionary with the field name as the key and the search term as the value. Then you use requests.post to do the search. Finally, you use the resulting requests object, “r”, and access its content property which you save to disk. We skipped the web browser part in this example (and the next) for brevity. Now we should be ready to see how to mechanize does its thing.
In Python 3, it should be noted that r.content now returns bytes instead of a string. This will cause a TypeError to be raised if we try to write that out to disc. To fix it, all we need to do is change the file flag from ‘w’ to ‘wb’, like this:
with open("requests_results.html", "wb") as f:

    f.write(r.content)
To know about Python- IP Address click here.
  
Like the Blog, then Share it with your friends and colleagues to make this AI community stronger. 
To learn more about nuances of Artificial Intelligence, Python Programming, Deep Learning, Data Science and Machine Learning, visit our insideAIML blog page.
Keep Learning. Keep Growing. 

Submit Review