Python API Querying and Scraping
Updated at
| API Querying | Command |
|---|---|
| Import module | import requests |
| GET Request | requests.get(url, params={}, headers={}) |
| POST Request | requests.post(url, json=payload) |
| PUT / PATCH Request | requests.put(url, json=payload) / requests.patch(url, json=payload) |
| DELETE Request | requests.delete(url) |
| Status | response.status_code |
| Content as string | response.text |
| Content as bytes | response.content |
| Response body parsed as JSON | response.json() |
| Content-Type | response.headers['content-type'] |
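A minimal sketch of the request and response attributes listed above, assuming the `requests` package is installed. The URL `https://api.example.com/items` and the helper names `build_get` and `summarize` are hypothetical. So that it runs without network access, the request is only prepared (not sent), and the response is built by hand as a stand-in for what `requests.get(...)` would return.

```python
import requests

# Hypothetical endpoint, used only for illustration.
URL = "https://api.example.com/items"


def build_get(url, params=None, headers=None):
    """Prepare (but do not send) a GET request so the final URL can be inspected."""
    return requests.Request("GET", url, params=params, headers=headers).prepare()


def summarize(response):
    """Collect the response attributes from the table into one dict."""
    return {
        "status": response.status_code,
        "content_type": response.headers.get("content-type"),  # header lookup is case-insensitive
        "text": response.text,     # body decoded to str
        "raw": response.content,   # body as undecoded bytes
        "data": response.json(),   # body parsed from JSON
    }


prepared = build_get(URL, params={"page": 2}, headers={"Accept": "application/json"})
print(prepared.url)  # https://api.example.com/items?page=2

# Hand-built response for offline illustration only. Setting the private
# _content attribute is a testing shortcut, not something to do in real code.
fake = requests.Response()
fake.status_code = 200
fake.headers["Content-Type"] = "application/json"
fake.encoding = "utf-8"
fake._content = b'{"ok": true}'

summary = summarize(fake)
print(summary["status"], summary["data"])  # 200 {'ok': True}
```

In real use, `summarize(requests.get(URL, params={"page": 2}))` would take the place of the hand-built response, usually after a call to `response.raise_for_status()`.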
| Scraping | Command |
|---|---|
| Import module | from bs4 import BeautifulSoup |
| Initialize the parser | parser = BeautifulSoup(response_content, 'html.parser') |
| Get the body tag | parser.body |
| Get the text inside a tag | parser.head.title.text |
| Find specific tags | parser.body.find_all('p', id='i', class_='c') |
| Find all tags by selectors | parser.body.select('.c') |
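The parser calls above can be tried on a small inline document. This is a sketch, assuming `beautifulsoup4` is installed. The HTML string is made up for illustration and stands in for the `response.text` of a real page.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the body of a fetched page.
HTML = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <p id="i" class="c">first</p>
    <p class="c">second</p>
    <p>third</p>
  </body>
</html>
"""

parser = BeautifulSoup(HTML, "html.parser")

# Text inside the <title> tag.
title = parser.head.title.text

# Filter by tag name plus attributes; class_ avoids the reserved word "class".
by_attrs = parser.body.find_all("p", id="i", class_="c")

# Filter with a CSS selector: every tag with class "c".
by_selector = parser.body.select(".c")

print(title)                          # Demo page
print([p.text for p in by_attrs])     # ['first']
print([p.text for p in by_selector])  # ['first', 'second']
```

`find_all` and `select` both return a list of tags, so an empty result is an empty list rather than `None`.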
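| Regex | Command | Notes |
|---|---|---|
| Python module | import re | re.search(pattern, string), re.findall(pattern, string) |
| Regex pattern check | s.str.contains(r'', na=False, flags=re.IGNORECASE) | re.IGNORECASE can be shortened to re.I |
| Regex pattern extract | s.str.extract(r'', expand=True, flags=flags) | expand=True returns a DataFrame; the pattern needs at least one capture group |
| Regex pattern replace | s.str.replace(r'', repl, flags=flags, regex=True) | regex=True is required for the pattern to be treated as a regex in pandas 2.0+ |
| Regex all patterns extract | s.str.extractall(r'') | one row per match; the pattern needs at least one capture group |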
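A short sketch of the two `re` functions using only the standard library. The sample text and pattern are made up for illustration.

```python
import re

# Made-up sample text and pattern: one letter followed by one or more digits.
text = "Order A12 shipped, order b7 pending"
pattern = r"[a-z]\d+"

# re.search returns the first match object, or None when nothing matches.
first = re.search(pattern, text, flags=re.IGNORECASE)
print(first.group(0))  # A12

# re.findall returns every non-overlapping match as a list of strings.
all_codes = re.findall(pattern, text, flags=re.IGNORECASE)
print(all_codes)  # ['A12', 'b7']

# Without the flag the pattern is case-sensitive, so "A12" is skipped.
lower_only = re.findall(pattern, text)
print(lower_only)  # ['b7']
```

Because `re.search` can return `None`, check the result before calling `.group()` on text that may not match.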
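The `s.str` methods can be sketched on a small Series, assuming `pandas` is installed and `s` is a `pandas.Series` of strings. The sample values and the group names `letter` and `number` are made up for illustration.

```python
import re

import pandas as pd

# Made-up sample data, including a missing value.
s = pd.Series(["Item A12", "item b7", None, "no code"])

# Check: na=False makes missing values count as "no match".
has_code = s.str.contains(r"[a-z]\d+", na=False, flags=re.IGNORECASE)
print(has_code.tolist())  # [True, True, False, False]

# Extract: each capture group becomes a column; named groups name the columns.
parts = s.str.extract(
    r"(?P<letter>[a-z])(?P<number>\d+)", flags=re.IGNORECASE, expand=True
)
print(parts.loc[0].tolist())  # ['A', '12']

# Replace: regex=True is passed explicitly so the pattern is not read literally.
masked = s.str.replace(r"[a-z]\d+", "<code>", flags=re.IGNORECASE, regex=True)
print(masked[0], "|", masked[3])  # Item <code> | no code

# Extract all: one row per match, indexed by (original index, match number).
digits = s.str.extractall(r"(\d)")
print(list(digits.index))  # [(0, 0), (0, 1), (1, 0)]
```

Rows with no match come back as missing values from `extract` and are left out of the result of `extractall`.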