# Lunch Time Python
## Lunch 1: Requests
*Scientific Software Center, Heidelberg University*  
*October 2021*  
*Visit on [GitHub](https://github.com/ssciwr/lunch-time-python)*  

Welcome to Lunch Time Python! This is the notebook for [session 1](https://ssciwr.github.io/lunch-time-python/lunchtime1/) - the [requests](https://docs.python-requests.org/en/latest/) library.

The requests library provides an elegant and simple way to send HTTP requests. Connect to the server of your choice and download websites, stream data or upload content. Requests is [one of the most downloaded Python packages](https://pypi.org/project/requests/), with about 14 million downloads per week and half a million repositories depending on it as of October 2021.

# Requests: HTTP for humans

Carry out HTTP/1.1 requests using Python! An HTTP request is made by a client to a server. For example, when you open a web page in your browser, your device sends a GET request to the web server hosting the page.

The start line of an HTTP request contains three elements: the HTTP method, the request target, and the HTTP version.

For example, when you open the page [ssc.iwr.uni-heidelberg.de](https://ssc.iwr.uni-heidelberg.de/), this is the message that is sent from the client to the server:

GET https://ssc.iwr.uni-heidelberg.de/ HTTP/1.1

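The above request contains the request method, GET, the URI of the target, https://ssc.iwr.uni-heidelberg.de/, and the protocol version, HTTP/1.1. When a client talks to the server directly rather than through a proxy, the target is usually just the path (here `/`) and the host name travels in a separate `Host` header.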
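
To make the three parts of the start line concrete, here is a minimal sketch that splits the example request line using plain Python string handling. The variable names are illustrative only and not part of any library.

```python
# The start line from the example above
start_line = "GET https://ssc.iwr.uni-heidelberg.de/ HTTP/1.1"

# The three elements are separated by single spaces
method, target, version = start_line.split(" ")

print(method)   # the HTTP method
print(target)   # the request target
print(version)  # the protocol version
```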

**These are the [main methods](https://www.tutorialspoint.com/http/http_methods.htm) for HTTP/1.1:**
1. GET  
The GET method is used to retrieve information from the given server using a given URI. Requests using GET should only retrieve data and should have no other effect on the data.

1. HEAD  
Same as GET, but transfers the status line and header section only.

1. POST  
A POST request is used to send data to the server, for example, customer information, file upload, etc. using HTML forms.

1. PUT  
Replaces all current representations of the target resource with the uploaded content.

1. DELETE  
Removes all current representations of the target resource given by a URI.

1. CONNECT  
Establishes a tunnel to the server identified by a given URI.

1. OPTIONS  
Describes the communication options for the target resource.

1. TRACE  
Performs a message loop-back test along the path to the target resource.

*Let's start requesting!  
To install requests on your local machine, simply use* `python -m pip install requests`.

In [None]:
import requests as rq
import json  # to pretty-print JSON responses

We will start with the above example -  
GET https://ssc.iwr.uni-heidelberg.de/ HTTP/1.1

In [None]:
targetURI = "https://ssc.iwr.uni-heidelberg.de/"
r = rq.get(url=targetURI)

This did something! Let's check the object that we obtained.

In [None]:
r.status_code

A few status codes are worth knowing. You are probably familiar with 404 Not Found. In general, status codes starting with 2 stand for successful requests, codes starting with 3 for redirections, codes starting with 4 for client-side errors, and codes starting with 5 for server-side errors.
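
The first digit of a status code gives its class. As a small sketch, the standard library's `http.HTTPStatus` can translate a numeric code into its official phrase; the helper `describe_status` is a hypothetical name introduced here for illustration.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description such as '404 Not Found (client error)'."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return "{} {} ({})".format(status.value, status.phrase, classes[code // 100])


for code in (200, 301, 404, 500):
    print(describe_status(code))
```

In a notebook you could pass `r.status_code` to this helper to get a readable summary of a response.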

In [None]:
targetURI = "https://en.wikipedia.org/wiki/Monty_Python"
r = rq.get(url=targetURI)

In [None]:
r.text
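
The calls above assume that the server answers promptly and successfully. By default, requests does not time out and does not raise an error for 4xx or 5xx status codes. The sketch below shows one possible way to guard a request; the helper name `fetch` and the five-second timeout are assumptions chosen for illustration.

```python
import requests as rq


def fetch(url, timeout=5):
    """Return the response for `url`, or None if the request failed."""
    try:
        r = rq.get(url, timeout=timeout)  # give up after `timeout` seconds
        r.raise_for_status()              # raise HTTPError for 4xx/5xx codes
    except rq.exceptions.RequestException as err:
        print("Request failed: {}".format(err))
        return None
    return r


# A URL without a scheme is rejected before anything is sent over the network
result = fetch("not-a-url")
```

Catching `RequestException` covers connection problems, timeouts, invalid URLs and the `HTTPError` raised by `raise_for_status()`.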

## The HTTP response
The response that you receive from the server contains the status line (as per `r.status_code`), the HTTP headers and a body. 

### The response header

In [None]:
r.headers

In [None]:
r.headers["content-type"]  # the dictionary is case-insensitive!
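
`r.headers` is a `CaseInsensitiveDict` from `requests.structures`. The following sketch builds one by hand, without contacting a server, to show the lookup behaviour; the header value is made up for illustration.

```python
from requests.structures import CaseInsensitiveDict

# Hypothetical header values, for illustration only
headers = CaseInsensitiveDict({"Content-Type": "text/html; charset=UTF-8"})

# All spellings of the key refer to the same entry
print(headers["content-type"])
print(headers["CONTENT-TYPE"])

# .get() works as for a regular dict and avoids a KeyError for absent headers
print(headers.get("X-Not-There", "header not present"))
```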

In [None]:
r.encoding  # the character encoding used to decode r.text

The headers include response headers (such as `Server`), general headers (e.g. information about the connection), and representation headers (e.g. the content length).
You can also see which cookies were sent back and how much time the request took.

In [None]:
r.cookies  # the cookies that the server sent back

In [None]:
r.elapsed  # time between sending the request and receiving the response headers

### The response body
Not all responses come with a body (the payload): if, for example, you PUT data on a server, the response does not necessarily contain a body. You can look at the response's body using `r.text` (the body decoded to a string using `r.encoding`) or `r.content` (the raw bytes, which also works for non-text content such as images).

In [None]:
r.text

In [None]:
r.content
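
The difference between bytes and decoded text can be reproduced without a server. This sketch uses a made-up string: `raw` plays the role of `r.content`, and `text` plays the role of `r.text` when the encoding is guessed correctly.

```python
# Bytes as they might arrive over the network (the role of r.content)
raw = "Schrödinger's café".encode("utf-8")

# Decoding with the correct encoding recovers the text (the role of r.text)
text = raw.decode("utf-8")

# Decoding with the wrong encoding yields garbled characters
garbled = raw.decode("latin-1")

print(raw)                   # the raw bytes, non-ASCII characters escaped
print(len(raw), len(text))   # non-ASCII characters need more than one byte
print(text == garbled)       # the wrong encoding does not give the same text
```

If `r.text` looks garbled, you can assign the correct value to `r.encoding` before reading `r.text` again.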

### Side note
This doesn't look too pretty. You can use BeautifulSoup (`pip install beautifulsoup4`) to improve its appearance, but that library could fill a whole other lunch time.

In [None]:
from bs4 import BeautifulSoup

In [None]:
soup = BeautifulSoup(r.content, "html.parser")
print(soup.prettify())

In [None]:
print(soup.text)

### Back to requests
Requests also has a built-in JSON decoder.

In [None]:
r = rq.get("https://api.github.com/events")
r.json()

# GET request with parameters
Now let's try to get something useful using requests (besides crawling the web and downloading pages!). Let's find out the geographic position of Heidelberg University using [Google's Geocoding API](https://developers.google.com/maps/documentation/geocoding/overview?_gl=1*oagjnc*_ga*MTk0NjcwNTg2Ni4xNjM1MTUzNjc5*_ga_NRWSTWS78N*MTYzNTE1MzY3OC4xLjAuMTYzNTE1MzY3OC4w). For this, you can create a trial account on Google's website to obtain an API key.

In [None]:
# api-endpoint
URI = "https://maps.googleapis.com/maps/api/geocode/json"
# API key
key = "XXXXXXXXXXXXXXXXXXX"

It is better practice to store the key securely outside of the notebook (and to add the configuration file to .gitignore).

In [None]:
import yaml

with open("config.yml", "r") as ymlfile:
    cfg = yaml.safe_load(ymlfile)
key = cfg["google_api"]["secret_code"]
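
An alternative to a configuration file is an environment variable, which keeps the key out of the repository altogether. This is a minimal sketch; the variable name `GOOGLE_API_KEY` and the helper `load_api_key` are assumptions made for this example.

```python
import os


def load_api_key(variable="GOOGLE_API_KEY"):
    """Read an API key from an environment variable (the name is hypothetical)."""
    key = os.environ.get(variable)
    if not key:
        raise RuntimeError("Please set the environment variable {}".format(variable))
    return key


# For demonstration only: normally the variable is set in the shell, not in code
os.environ.setdefault("GOOGLE_API_KEY", "XXXXXXXXXXXXXXXXXXX")
demo_key = load_api_key()
```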

In [None]:
# location to geocode
location = "university of heidelberg"
country = "germany"
# defining a params dict for the parameters to be sent to the API
parameters = {"key": key, "address": location, "country": country}
# sending get request and saving the response as response object
r = rq.get(url=URI, params=parameters)
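
To see how the `params` dictionary ends up in the URL, you can prepare a request without sending it. The sketch below uses a placeholder key and the same parameters as above; nothing is transmitted over the network.

```python
import requests as rq

URI = "https://maps.googleapis.com/maps/api/geocode/json"
parameters = {
    "key": "XXXXXXXXXXXXXXXXXXX",  # placeholder, not a real key
    "address": "university of heidelberg",
    "country": "germany",
}

# Build the request without sending it
prepared = rq.Request("GET", URI, params=parameters).prepare()

# The parameters are URL-encoded and appended after a "?"
print(prepared.url)
```

For a response that was actually sent, the final URL is available as `r.url`.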

In [None]:
r.status_code

In [None]:
# extracting data in json format
data = r.json()

In [None]:
print(data)

In [None]:
# print this a little prettier
print(json.dumps(data, indent=4, sort_keys=True))

In [None]:
address_out = data["results"][0]["formatted_address"]
# printing the output
print("Address is {}.".format(address_out))

In [None]:
latitude = data["results"][0]["geometry"]["location"]["lat"]
longitude = data["results"][0]["geometry"]["location"]["lng"]
# printing the output
print("Latitude is {} and longitude {}.".format(latitude, longitude))
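
Indexing with `data["results"][0]` fails if the API returns no results. The Geocoding API reports the outcome in a `status` field of the JSON body, so a more defensive variant could look like the sketch below. The helper `extract_location` and the sample dictionary are made up for illustration and only mimic the structure of the response used above.

```python
def extract_location(data):
    """Return (address, latitude, longitude) of the first result, or None."""
    # The Geocoding API reports problems in the "status" field of the body
    if data.get("status") != "OK" or not data.get("results"):
        return None
    first = data["results"][0]
    location = first["geometry"]["location"]
    return first["formatted_address"], location["lat"], location["lng"]


# Made-up sample with the same structure as the response used above
sample = {
    "status": "OK",
    "results": [
        {
            "formatted_address": "Example Street 1, 12345 Example City",
            "geometry": {"location": {"lat": 1.0, "lng": 2.0}},
        }
    ],
}
print(extract_location(sample))
print(extract_location({"status": "ZERO_RESULTS", "results": []}))
```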

# Making a POST request
Again we need an account for this example. This time, we are using the service [Pastebin](https://pastebin.com/). You can send text to this address and it will be publicly visible. It serves as storage for text data.

In [None]:
# defining the api-endpoint
api_endpoint = "https://pastebin.com/api/api_post.php"
# API key
key = "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"

In [None]:
key = cfg["pastebin_api"]["secret_code"]

In [None]:
# the API option
option = "paste"
# name/title of your paste
api_paste_name = "lunch time python"
# syntax highlighting
api_format = "python"
# this makes a paste public, unlisted or private, public = 0, unlisted = 1, private = 2
private = 0
# the text you want to paste, for example, a code snippet in python
text = """
print("Hello, lunch time!")
x = 'my lunch'
y = 'your lunch'
print('{} {}'.format(x, y))
"""
# data dictionary, to be sent to api
data = {
    "api_dev_key": key,
    "api_option": option,
    "api_paste_code": text,
    "api_paste_name": api_paste_name,
    "api_paste_format": api_format,
    "api_paste_private": private,
}

# sending post request and saving response as response object
r = rq.post(url=api_endpoint, data=data)

In [None]:
r.status_code

In [None]:
# extracting response text
pastebin_url = r.text
print("The pastebin URL is {}".format(pastebin_url))
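
When a dictionary is passed via `data=`, requests form-encodes it, just like an HTML form would. The following sketch prepares a POST request without sending it so that the encoded body can be inspected; the key is a placeholder and the form is shortened for illustration.

```python
import requests as rq
from urllib.parse import parse_qs

form = {
    "api_dev_key": "XXXXXXXXXXXX",  # placeholder, not a real key
    "api_option": "paste",
    "api_paste_code": 'print("Hello, lunch time!")',
}

# Build the POST request without sending it
prepared = rq.Request(
    "POST", "https://pastebin.com/api/api_post.php", data=form
).prepare()

print(prepared.headers["Content-Type"])  # how the body is encoded
print(prepared.body)                     # the form-encoded payload
print(parse_qs(prepared.body))           # decoded again for comparison
```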

# Making a PUT request
A PUT request is similar to a POST request, but it is *idempotent*: sending the same PUT request several times has the same effect as sending it once, because the target is replaced each time. Repeating a POST request, in contrast, may create the target multiple times. In the above example from Pastebin, a POST request generates a new paste, while a PUT request would replace or alter an existing paste. For the differences between HTTP methods, see [here](https://www.w3schools.com/tags/ref_httpmethods.asp).
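
The difference can be sketched with a toy in-memory "server". This is not how any real service is implemented; `store`, `pastes`, `put` and `post` are hypothetical names used only to contrast the two behaviours.

```python
store = {}     # resources addressed by a URI-like key
pastes = []    # collection that grows with every POST


def put(key, value):
    """Create or replace the resource stored under `key`."""
    store[key] = value


def post(value):
    """Append a new resource and return its position in the collection."""
    pastes.append(value)
    return len(pastes) - 1


# Repeating a PUT leaves the same single resource behind
put("/paste/1", "my lunch")
put("/paste/1", "my lunch")

# Repeating a POST creates two separate resources
post("my lunch")
post("my lunch")

print(len(store), len(pastes))
```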

For the PUT example, we will use [httpbin](https://httpbin.org/). This is an open service that allows you to test API calls and authentication methods.

In [None]:
# the api-endpoint
api_endpoint = "https://httpbin.org/put"
# the data to send - httpbin echoes it back in its JSON response
data_type = "application/json"
# storing in a dictionary (sent as form data in the request body)
data = {"accept": data_type}
# Making a PUT request
r = rq.put(url=api_endpoint, data=data)

In [None]:
# print the response object
print(r)
print("*************************")
# check the status code of the response
print(r.status_code)
print("*************************")
# print the content of the response
print(r.content)
print("*************************")
# print the decoded JSON body of the response
print(r.json())
print("*************************")
# print this a little prettier
print(json.dumps(r.json(), indent=4, sort_keys=True))
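
In the PUT example above, the `accept` entry travels as form data in the request body. If the goal is to send a JSON payload and to state the preferred response format, one possible variant uses the `json=` and `headers=` arguments instead. The sketch below only prepares the request, so it runs without network access; the payload is a made-up example.

```python
import json
import requests as rq

payload = {"lunch": "python"}  # hypothetical example payload

# Build the PUT request without sending it
prepared = rq.Request(
    "PUT",
    "https://httpbin.org/put",
    json=payload,                            # serialized to a JSON body
    headers={"Accept": "application/json"},  # preferred response format
).prepare()

print(prepared.headers["Accept"])
print(prepared.headers["Content-Type"])
print(json.loads(prepared.body))
```

The same arguments can be passed directly to `rq.put(...)` to actually send the request.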

# Advanced topics
There is so much more you can do with requests - for example:
- [sessions](https://docs.python-requests.org/en/latest/user/advanced/#session-objects) which allow you to re-use the connection to the server (through connection pooling, leading to faster requests); 
- [SSL certificate verification](https://docs.python-requests.org/en/latest/user/advanced/#ssl-cert-verification) which verifies the server's certificate for HTTPS requests;
- [streaming](https://docs.python-requests.org/en/latest/user/advanced/#streaming-requests); 
- and [much more](https://docs.python-requests.org/en/latest/user/advanced/)!
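
As a first taste of session objects, the sketch below sets a default header on a session and prepares a request through it to show that the header is carried along. Nothing is sent; the httpbin URL and the header value are chosen for illustration only.

```python
import requests as rq

with rq.Session() as session:
    # Headers set on the session are added to every request made through it
    session.headers.update({"Accept": "application/json"})

    # Build (but do not send) a request to inspect what the session would send
    request = rq.Request("GET", "https://httpbin.org/get")
    prepared = session.prepare_request(request)
    accept_header = prepared.headers["Accept"]

    # Sending would look like this (needs network access):
    # r = session.get("https://httpbin.org/get", timeout=5)

print(accept_header)
```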