import requests
url = "http://lewism.net"
response = requests.get(url)
print response.content
如果执行成功的话,你就可以看到我的主页内容源代码了~就是这么简单,四行代码就可以把我的主页内容爬下来。
>>> url = 'https://api.github.com/some/endpoint'
>>> headers = {'user-agent': 'my-app/0.0.1'}
>>> r = requests.get(url, headers=headers)
>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'
To send your own cookies to the server, you can use the cookies parameter:
>>> url = 'http://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
import requests
proxies = {
'http': 'http://10.10.1.10:3128',
'https': 'http://10.10.1.10:1080',
}
requests.get('http://example.org', proxies=proxies)
You can also configure proxies by setting the environment variables HTTP_PROXY and HTTPS_PROXY.
$ export HTTP_PROXY="http://10.10.1.10:3128"
$ export HTTPS_PROXY="http://10.10.1.10:1080"
$ python
>>> import requests
>>> requests.get('http://example.org')
end(ps:以上代码全部来自官方文档,我觉得比较常用所以记录一下~)