Python httpx

last modified January 29, 2024

This Python httpx tutorial shows how to create HTTP requests in Python with the httpx module. The module supports both synchronous and asynchronous HTTP requests.

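The httpx module

HTTPX is an HTTP client for Python 3 that provides synchronous and asynchronous APIs and supports both HTTP/1.1 and HTTP/2. Its API is similar to that of the popular Python requests library. The httpx release used in this tutorial (0.16.1) requires Python 3.6+; newer releases require a more recent Python version.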

$ pip install httpx

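We install the module with the pip command.

httpx supports asynchronous web requests. Combining httpx with the asyncio module and the async and await keywords lets us issue web requests asynchronously. For programs that make many requests, this can noticeably reduce the total running time.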

HTTP

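The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web.

Python httpx status code

In the first example, we determine the status of a web page. The status code is available through the status_code attribute of the response.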

sync_status.py

#!/usr/bin/python

import httpx 

r = httpx.head('http://webcode.me')
print(r.status_code)

The example creates a synchronous HEAD request to the webcode.me website and receives an HTTP response. From the response, we get the status code.

$ ./sync_status.py 
200
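A numeric status code is easier to act on once it is mapped to its meaning. The following sketch uses the standard library's http.HTTPStatus for that. The helper name describe_status is hypothetical and is not part of httpx; it also assumes the code is one that HTTPStatus knows, otherwise a ValueError is raised.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '200 OK (success)'."""
    # Hypothetical helper for illustration; not part of httpx.
    phrase = HTTPStatus(code).phrase

    if 200 <= code < 300:
        kind = 'success'
    elif 300 <= code < 400:
        kind = 'redirect'
    elif 400 <= code < 500:
        kind = 'client error'
    elif 500 <= code < 600:
        kind = 'server error'
    else:
        kind = 'informational'

    return f'{code} {phrase} ({kind})'

print(describe_status(200))
print(describe_status(404))
```

With the response from the example above, the helper could be called as describe_status(r.status_code).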

Python httpx GET request

The following example creates a synchronous GET request.

sync_get.py

#!/usr/bin/python

import httpx 

r = httpx.get('http://webcode.me')
print(r.text)

We generate a GET request to the web page with the httpx.get function. The page is retrieved and its HTML code is printed to the console.

$ ./sync_get.py 
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My html page</title>
</head>
<body>

    <p>
        Today is a beautiful day. We go swimming and fishing.
    </p>
    
    <p>
         Hello there. How are you?
    </p>
    
</body>
</html>

The params option sends query parameters along with the request.

sync_query_params.py

#!/usr/bin/python

import httpx 

payload = {'name': 'John Doe', 'occupation': 'gardener'}
r = httpx.get('https://httpbin.org/get', params=payload)
print(r.text)

The example sends query parameters with a GET request.

$ ./sync_query_params.py 
{
    "args": {
    "name": "John Doe", 
    "occupation": "gardener"
    }, 
    "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate, br", 
    "Host": "httpbin.org", 
    "User-Agent": "python-httpx/0.16.1", 
    "X-Amzn-Trace-Id": "Root=1-600817ec-25cb3dea461b3e7a6f21df27"
    }, 
    ...
    "url": "https://httpbin.org/get?name=John+Doe&occupation=gardener"
}
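The url field in the output shows how the parameters end up in the query string. As a small illustration, the sketch below takes that URL and decodes it again with the standard library's urllib.parse module; it does not use httpx and only demonstrates the encoding.

```python
from urllib.parse import urlsplit, parse_qs

# The URL reported by httpbin.org in the output above.
url = 'https://httpbin.org/get?name=John+Doe&occupation=gardener'

# Everything after the '?' is the query string.
query = urlsplit(url).query
print(query)

# parse_qs turns '+' back into a space and collects values in lists,
# because a key may appear more than once in a query string.
params = parse_qs(query)
print(params)
```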

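Python httpx POST form request

A POST request is generated with the httpx.post function.

With application/x-www-form-urlencoded, the data is sent in the body of the request. The keys and values are encoded as key-value pairs separated by '&', with '=' between each key and its value.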

sync_post_form.py

#!/usr/bin/python

import httpx 

payload = {'name': 'John Doe', 'occupation': 'gardener'}

r = httpx.post('https://httpbin.org/post', data=payload)
print(r.text)

We generate a synchronous POST request with form data to httpbin.org/post. The payload is passed to the data option.

$ ./sync_post_form.py 
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {
    "name": "John Doe", 
    "occupation": "gardener"
  }, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate, br", 
    "Content-Length": "33", 
    "Content-Type": "application/x-www-form-urlencoded", 
    "Host": "httpbin.org", 
    "User-Agent": "python-httpx/0.16.1", 
    "X-Amzn-Trace-Id": "Root=1-600819fd-5e7b28a97b2484c8438a6f2e"
  }, 
  "json": null, 
  ... 
  "url": "https://httpbin.org/post"
}
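The Content-Length of 33 in the output can be reproduced by encoding the payload by hand. The sketch below uses urllib.parse.urlencode from the standard library; it illustrates the application/x-www-form-urlencoded format and is not necessarily how httpx builds the body internally.

```python
from urllib.parse import urlencode

payload = {'name': 'John Doe', 'occupation': 'gardener'}

# Pairs are joined with '&', keys and values with '=';
# the space in 'John Doe' is encoded as '+'.
body = urlencode(payload)
print(body)                       # name=John+Doe&occupation=gardener

# The body length in bytes is what the Content-Length header carries.
print(len(body.encode('ascii')))  # 33
```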

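Python httpx stream data

For larger downloads, we can stream the response so that the entire response body is not loaded into memory at once. For streaming, we use the httpx.stream function.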

sync_stream.py

#!/usr/bin/python

import httpx

url = 'https://download.freebsd.org/ftp/releases/amd64/amd64/ISO-IMAGES/12.0/FreeBSD-12.0-RELEASE-amd64-mini-memstick.img'

with open('FreeBSD-12.0-RELEASE-amd64-mini-memstick.img', 'wb') as f:

    with httpx.stream('GET', url) as r:

        for chunk in r.iter_bytes():
            f.write(chunk)

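The example downloads a disk image of the FreeBSD operating system. We iterate over the binary content with the iter_bytes method and write each chunk to the file.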

Python httpx async GET request

The following example generates a simple asynchronous GET request.

async_get.py

#!/usr/bin/python

import httpx
import asyncio

async def main():
    async with httpx.AsyncClient() as client:
        r = await client.get('http://test.webcode.me')
        print(r.text)

asyncio.run(main())

In the example, we retrieve a small HTML page asynchronously.

import httpx
import asyncio

We need to import the httpx and asyncio modules. The asyncio module is a library for writing concurrent code with the async/await syntax; it is often a good fit for I/O-bound tasks.

async def main():

With async def, we create a coroutine. Coroutines are used for cooperative multitasking.

async with httpx.AsyncClient() as client:

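We create an asynchronous HTTP client. Asynchronous code only pays off when the library itself supports the asynchronous programming model; httpx provides this through AsyncClient.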

r = await client.get('http://test.webcode.me')

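With the await keyword, the get coroutine is started and main is suspended; control returns to the event loop, which is free to run other tasks while the request is in flight. The program as a whole is therefore not blocked by the wait. When the response arrives, the event loop resumes main at the point where it was awaiting.

Note: the await keyword can only be used inside an async function.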

asyncio.run(main())

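The asyncio.run function starts the event loop and runs the main coroutine. The event loop is the central point that registers, executes, and cancels asynchronous functions.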
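The way await hands control back to the event loop can be observed without any network traffic. In the sketch below, asyncio.sleep stands in for a slow request; the fetch coroutine, its names, and the delays are hypothetical. While one coroutine is suspended at its await, the event loop runs the other.

```python
import asyncio

events = []

async def fetch(name, delay):
    # Stand-in for an HTTP request that takes 'delay' seconds.
    events.append(f'{name} start')
    await asyncio.sleep(delay)   # suspended here; the loop runs others
    events.append(f'{name} done')
    return name

async def main():
    # Both coroutines are scheduled together and wait concurrently.
    return await asyncio.gather(fetch('slow', 0.2), fetch('fast', 0.1))

results = asyncio.run(main())

print(events)
print(results)
```

Both coroutines start before either finishes, and the faster one completes first. The results list still follows the order in which the coroutines were passed to gather.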

Python httpx multiple asynchronous GET requests

The asyncio.gather function runs coroutines concurrently.

async_mul_get.py

#!/usr/bin/python

import httpx
import asyncio

async def get_async(url):
    async with httpx.AsyncClient() as client:
        return await client.get(url)

urls = ["http://webcode.me", "https://httpbin.org/get"]

async def launch():
    resps = await asyncio.gather(*map(get_async, urls))
    data = [resp.text for resp in resps]
    
    for html in data:
        print(html)

asyncio.run(launch())

The example generates two asynchronous GET requests.

async def get_async(url):
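    async with httpx.AsyncClient() as client:
        return await client.get(url)

This coroutine generates an asynchronous GET request. It opens a new client for each URL, which keeps the example simple but forgoes connection reuse.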

urls = ["http://webcode.me", "https://httpbin.org/get"]

We have two URLs.

async def launch():
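    resps = await asyncio.gather(*map(get_async, urls))
    data = [resp.text for resp in resps]

    for html in data:
        print(html)

With the built-in map function, we apply get_async to each URL. The resulting iterator of coroutines is unpacked into positional arguments with the * (star) operator. If all coroutines complete successfully, the result is a list of their return values (here, the responses) in the same order as the URLs; we then read the HTML code from the text attribute of each response.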
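When one of the coroutines fails, gather by default propagates the first exception to the caller. Passing return_exceptions=True instead places exceptions in the result list next to the successful values. The sketch below demonstrates this with a stand-in coroutine; fetch, the names, and the simulated failure are assumptions for illustration and involve no real HTTP traffic.

```python
import asyncio

async def fetch(name):
    # Stand-in for an HTTP call; 'bad' simulates a failed request.
    await asyncio.sleep(0.01)

    if name == 'bad':
        raise ValueError(f'request to {name} failed')

    return f'<html>{name}</html>'

async def launch():
    names = ['first', 'bad', 'second']

    # Exceptions are returned as list items instead of being raised.
    return await asyncio.gather(*map(fetch, names), return_exceptions=True)

results = asyncio.run(launch())

for item in results:
    if isinstance(item, Exception):
        print('error:', item)
    else:
        print(item)
```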

Python httpx async POST form request

The following example shows how to send an asynchronous POST request with form data.

async_post_form.py

#!/usr/bin/python

import httpx
import asyncio

async def main():

    data = {'name': 'John Doe', 'occupation': 'gardener'}

    async with httpx.AsyncClient() as client:
        r = await client.post('https://httpbin.org/post', data=data)
        print(r.text)

asyncio.run(main())

The data is passed to the data option of the post coroutine.

Python httpx async POST JSON request

The following example shows how to send an asynchronous POST request with JSON data.

post_json.py

#!/usr/bin/python

import httpx
import asyncio

async def main():

    data = {'int': 123, 'boolean': True, 'list': ['a', 'b', 'c']}

    async with httpx.AsyncClient() as client:
        r = await client.post('https://httpbin.org/post', json=data)
        print(r.text)

asyncio.run(main())

The data is passed to the json option of the post coroutine.

Python httpx async stream request

The example shows how to download a large binary file as an asynchronous stream.

async_stream.py

#!/usr/bin/python

import httpx
import asyncio

url = 'https://download.freebsd.org/ftp/releases/amd64/amd64/ISO-IMAGES/12.0/FreeBSD-12.0-RELEASE-amd64-mini-memstick.img'


async def main():
    with open('FreeBSD-12.0-RELEASE-amd64-mini-memstick.img', 'wb') as f:

        async with httpx.AsyncClient() as client:
            async with client.stream('GET', url) as r:

                async for chunk in r.aiter_bytes():
                    f.write(chunk)

asyncio.run(main())

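We use the client.stream and aiter_bytes methods.

Comparing sync and async requests

In the following two examples, we compare the efficiency of a group of synchronous and asynchronous requests. With the time module, we calculate the elapsed time.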

multiple_sync.py

#!/usr/bin/python

import httpx
import time

urls = ['http://webcode.me', 'https://httpbin.org/get', 
    'https://google.com', 'https://stackoverflow.com', 
    'https://github.com', 'https://mozilla.org']

start_time = time.monotonic()

for url in urls:
    r = httpx.get(url)
    print(r.status_code)

print(f'Elapsed: {time.monotonic() - start_time}')

We generate six synchronous GET requests and calculate the elapsed time.

multiple_async.py

#!/usr/bin/python

import httpx
import asyncio
import time

async def get_async(url):
    async with httpx.AsyncClient() as client:
        return await client.get(url)

urls = ['http://webcode.me', 'https://httpbin.org/get', 
    'https://google.com', 'https://stackoverflow.com', 
    'https://github.com']

async def launch():
    resps = await asyncio.gather(*map(get_async, urls))
    data = [resp.status_code for resp in resps]
    
    for status_code in data:
        print(status_code)

start_time = time.monotonic()
asyncio.run(launch())
print(f'Elapsed: {time.monotonic() - start_time}')

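We generate five asynchronous GET requests. Note that this list omits https://mozilla.org, which the synchronous example includes.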

$ ./multiple_async.py 
200
200
200
200
200
Elapsed: 0.935432159982156
$ ./multiple_sync.py 
200
200
200
200
200
200
Elapsed: 3.5428215700085275

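In our case, the difference is more than 2.5 seconds. Because the asynchronous run makes five requests and the synchronous run six, the comparison is only approximate.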
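The reason for the gap can be shown with a small simulation that needs no network access. In the sketch below, sleeping stands in for waiting on a server; the DELAYS values are assumed response times, not measurements. Sequential waits add up, whereas concurrent waits overlap and take roughly as long as the slowest one.

```python
import asyncio
import time

DELAYS = [0.05, 0.1, 0.15]  # assumed response times in seconds

def run_sequential():
    # Each wait starts only after the previous one has finished.
    start = time.monotonic()
    for delay in DELAYS:
        time.sleep(delay)
    return time.monotonic() - start

async def run_concurrent():
    # All waits are started together and overlap.
    start = time.monotonic()
    await asyncio.gather(*(asyncio.sleep(delay) for delay in DELAYS))
    return time.monotonic() - start

sequential = run_sequential()
concurrent = asyncio.run(run_concurrent())

print(f'sequential: {sequential:.2f} s')
print(f'concurrent: {concurrent:.2f} s')
```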

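Sources

Python httpx documentation

In this article, we have generated synchronous and asynchronous web requests in Python with the httpx module.

Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I have more than ten years of experience in teaching programming.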
