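Python os.cpu_count function

Last modified: April 11, 2025

This guide covers Python's os.cpu_count function, which reports the number of CPUs in the system. It walks through basic usage, multiprocessing applications, and practical examples.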

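Basic definitions

The os.cpu_count function returns the number of logical CPUs in the system, or None if the number cannot be determined. The value is a useful starting point for sizing parallel workloads.

The count is of logical CPUs, not physical cores. On systems with hyper-threading, the number of logical CPUs may be twice the number of physical cores.

Basic CPU count detection

The simplest use of os.cpu_count is to retrieve the total number of logical CPUs. This gives you the system information needed for resource allocation.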

basic_count.py

import os

# Get CPU count
cpu_count = os.cpu_count()

if cpu_count is not None:
    print(f"Number of CPU cores: {cpu_count}")
else:
    print("Could not determine CPU count")

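This example shows the basic usage pattern. The function may return None if the count cannot be determined, so always check for that case.

The result varies with system configuration. On hardware that supports hyper-threading, each hardware thread is counted as a separate CPU.

Setting the multiprocessing pool size

os.cpu_count is commonly used to size a multiprocessing pool. This example creates a pool with one worker process per CPU.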

multiprocessing_pool.py

import os
import multiprocessing

def worker(num):
    return num * num

if __name__ == '__main__':
    # Use all available cores
    pool_size = os.cpu_count() or 1
    
    with multiprocessing.Pool(pool_size) as pool:
        results = pool.map(worker, range(10))
        print(results)

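This creates a multiprocessing pool with one worker process per CPU. The "or 1" fallback guarantees at least one worker when os.cpu_count returns None.

Using all available CPUs generally maximizes parallel throughput for CPU-bound tasks, provided nothing else on the machine is competing for them.

Adjusting for hyper-threading

Some workloads benefit from using only physical cores. This example estimates the physical core count by halving the logical count.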

physical_cores.py

import os

def get_physical_cores():
    logical_cores = os.cpu_count() or 1
    # Assume hyper-threading doubles core count
    return max(1, logical_cores // 2)

print(f"Logical cores: {os.cpu_count()}")
print(f"Estimated physical cores: {get_physical_cores()}")

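Halving the logical count gives only a rough estimate of the physical cores. The estimate is wrong on systems without hyper-threading and may be wrong on CPUs that mix core types, so treat it as a heuristic.

Exact physical core detection may require platform-specific tools, such as lscpu on Linux.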
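On Linux, one way to get closer to the real number is to count the distinct (core, socket) pairs that lscpu reports in its parsable output. The following is a minimal sketch under that assumption. The helper names parse_lscpu_cores and physical_cores_linux are hypothetical, and the function returns None wherever lscpu is missing or fails.

```python
import subprocess

def parse_lscpu_cores(text):
    """Count unique (core, socket) pairs in `lscpu --parse=CORE,SOCKET` output."""
    pairs = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' comment header that lscpu emits
        if not line or line.startswith("#"):
            continue
        pairs.add(tuple(line.split(",")))
    return len(pairs)

def physical_cores_linux():
    """Return the physical core count, or None if lscpu is unavailable."""
    try:
        out = subprocess.check_output(
            ["lscpu", "--parse=CORE,SOCKET"], text=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    # An empty result is treated as "unknown" rather than zero cores
    return parse_lscpu_cores(out) or None

if __name__ == "__main__":
    print(f"Physical cores (lscpu): {physical_cores_linux()}")
```

Keeping the parsing in its own function makes it possible to check the logic against a saved lscpu output without running the command.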

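Thread pool executor configuration

os.cpu_count also helps size a ThreadPoolExecutor. This example distributes a batch of computations across a pool of threads. Note that in standard CPython builds the global interpreter lock (GIL) prevents threads from running Python bytecode in parallel, so threads mainly pay off for I/O-bound work.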

thread_pool.py

import os
from concurrent.futures import ThreadPoolExecutor

def cpu_intensive(n):
    return sum(i*i for i in range(n))

if __name__ == '__main__':
    max_workers = (os.cpu_count() or 1) * 2  # 2x multiplier, a common choice for I/O-bound work
    with ThreadPoolExecutor(max_workers) as executor:
        results = list(executor.map(cpu_intensive, [10**6]*10))
        print(f"Completed {len(results)} tasks with {max_workers} workers")

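For CPU-bound tasks, the worker count usually matches the number of CPUs, and in CPython a process pool is normally the better fit because of the GIL. For I/O-bound tasks, multiplying by a factor such as 2 to 4 may improve throughput.

The best worker count depends on the characteristics of the task and on system resources.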
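The rule of thumb above can be captured in a small helper. This is only a sketch: the function name suggest_workers and its parameters are hypothetical, and the default factor of 2 is a starting point to tune, not a recommendation from the os module.

```python
import os

def suggest_workers(io_bound, cpu_total=None, io_factor=2):
    """Suggest a worker count from the CPU count and the workload type.

    cpu_total lets callers pass a known count; otherwise os.cpu_count()
    is used, falling back to 1 when it returns None.
    """
    cores = cpu_total if cpu_total is not None else (os.cpu_count() or 1)
    if io_bound:
        # I/O-bound workers spend most of their time waiting, so oversubscribe
        return cores * io_factor
    # CPU-bound work: one worker per CPU
    return cores

if __name__ == "__main__":
    print(f"CPU-bound workers: {suggest_workers(io_bound=False)}")
    print(f"I/O-bound workers: {suggest_workers(io_bound=True, io_factor=4)}")
```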

Custom resource allocation

This example reserves some CPUs for other processes and runs the application on the remainder.

resource_allocation.py

import os

def allocate_cores(reserve=1):
    total = os.cpu_count() or 1
    available = max(1, total - reserve)
    print(f"Total cores: {total}")
    print(f"Allocating {available} cores (reserving {reserve})")
    return available

# Reserve 2 cores for system processes
cores_to_use = allocate_cores(2)
print(f"Using {cores_to_use} cores for processing")

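This pattern is useful when the application runs alongside other critical processes. The reserve parameter specifies how many CPUs to leave free.

Always allocate at least one CPU so that edge cases never produce zero workers.

Cross-platform CPU detection

This example falls back to platform-specific detection when os.cpu_count returns None.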

cross_platform.py

import os
import platform
import subprocess

def get_cpu_count():
    count = os.cpu_count()
    if count is not None:
        return count

    # Fall back to platform-specific sources
    system = platform.system()
    try:
        if system == "Linux":
            return int(subprocess.check_output(["nproc"]))
        elif system == "Windows":
            return int(os.environ["NUMBER_OF_PROCESSORS"])
        elif system == "Darwin":
            return int(subprocess.check_output(["sysctl", "-n", "hw.ncpu"]))
    except (OSError, subprocess.SubprocessError, KeyError, ValueError):
        pass
    # Unknown platform or failed lookup: assume a single CPU
    return 1

print(f"Detected CPU cores: {get_cpu_count()}")

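When os.cpu_count fails, this falls back to nproc on Linux, the NUMBER_OF_PROCESSORS environment variable on Windows, and sysctl on macOS. If the fallback also fails, the function returns 1.

In some edge cases the platform-specific methods may give a more accurate count.

Performance benchmarking

This example uses os.cpu_count to configure a benchmark that scales with the available CPU resources.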

benchmark.py

import os
import time
import multiprocessing

def stress_test(duration):
    end = time.time() + duration
    while time.time() < end:
        pass

if __name__ == '__main__':
    duration = 5  # seconds per core
    cores = os.cpu_count() or 1
    print(f"Running stress test on {cores} cores for {duration} seconds each")
    
    processes = []
    for _ in range(cores):
        p = multiprocessing.Process(target=stress_test, args=(duration,))
        p.start()
        processes.append(p)
    
    for p in processes:
        p.join()
    
    print("Benchmark completed")

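This starts one CPU-bound process per CPU, each running for the specified duration. The benchmark workload scales automatically with the available CPU resources.

Benchmarks of this kind help evaluate how a system behaves under full CPU load.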

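Security considerations

Resource limits: containers and VMs may report the host's CPU count
Dynamic scaling: cloud environments may change the CPU count
Affinity masks: a process may be restricted to a subset of the CPUs
Power saving: some cores may be taken offline to save power
Virtual cores: the count may not reflect actual performance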
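The affinity caveat can be handled in code on platforms that expose the mask. The sketch below prefers os.process_cpu_count (added in Python 3.13), then os.sched_getaffinity (available on some Unix systems), and finally os.cpu_count. The helper name usable_cpu_count is hypothetical. None of these calls are assumed here to account for container CPU quotas, so a container limit may still need separate handling.

```python
import os

def usable_cpu_count():
    """Return the number of CPUs this process may run on (at least 1)."""
    # Python 3.13+: logical CPUs usable by the calling thread
    if hasattr(os, "process_cpu_count"):
        count = os.process_cpu_count()
        if count:
            return count
    # Some Unix platforms: size of the process's CPU affinity set
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Everything else: total logical CPUs, with a fallback of 1
    return os.cpu_count() or 1

if __name__ == "__main__":
    print(f"System CPUs: {os.cpu_count()}")
    print(f"Usable CPUs: {usable_cpu_count()}")
```

On a machine where the process has been pinned, for example with taskset on Linux, the second number can be smaller than the first.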

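Best practices

Always check for None: provide a fallback value (usually 1)
Consider the workload: adjust the worker count to the task type
Reserve resources: leave cores free for system processes
Monitor utilization: adjust based on measured performance
Document assumptions: record where the code depends on the CPU count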
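Several of these practices fit in one small function: fall back when the count is None, reserve CPUs, and log what was decided so the assumption is visible later. This is a sketch only. The name resolve_workers and the cpu_count_fn parameter are hypothetical, the latter added so the count source can be swapped out for testing.

```python
import logging
import os

logger = logging.getLogger(__name__)

def resolve_workers(reserve=1, cpu_count_fn=os.cpu_count):
    """Pick a worker count and log the inputs the decision was based on."""
    detected = cpu_count_fn()
    total = detected or 1              # fall back to 1 when the count is unknown
    workers = max(1, total - reserve)  # never drop below one worker
    logger.info(
        "CPU count detected=%s, reserved=%d, workers=%d",
        detected, reserve, workers,
    )
    return workers

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    resolve_workers(reserve=1)
```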

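Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have written over 1,400 articles and 8 e-books. I have more than ten years of experience teaching programming.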
