I have a Python script that I run in a shell using the subprocess package:
subprocess.call("mycode.py", shell=inshell)
When I run the top command, I see that I am only using about 30% or less of the CPU. I realize some commands may be disk-bound rather than CPU-bound, so I am timing the runs. Running this on a Linux system seems slower than on a 2-core Mac.
How can I parallelize this with the threading or multiprocessing package so that I can use multiple CPU cores on that Linux system?
Best Answer
To parallelize the work done in mycode.py, you need to organize your code so that it fits this basic pattern:
# Import the kind of pool you want to use (processes or threads).
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool
# Collect work items as an iterable of single values (eg tuples,
# dicts, or objects). If you can't hold all items in memory,
# define a function that yields work items instead.
work_items = [
    (1, 'A', True),
    (2, 'X', False),
    ...
]
# Define a callable to do the work. It should take one work item.
def worker(tup):
    # Do the work.
    ...
    # Return any results.
    ...
# Create a ThreadPool (or a process Pool) of desired size.
# What size? Experiment. Slowly increase until it stops helping.
pool = ThreadPool(4)
# Do work and collect results.
# Or use pool.imap() or pool.imap_unordered().
work_results = pool.map(worker, work_items)
# Wrap up.
pool.close()
pool.join()
---------------------
# Or, in Python 3.3+ you can do it like this, skipping the wrap-up code.
with ThreadPool(4) as pool:
    work_results = pool.map(worker, work_items)
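Applied to the question above, the worker can launch mycode.py via subprocess, one child process per work item. Here is a minimal sketch under two assumptions: that mycode.py accepts a single command-line argument (the argument values below are hypothetical), and that the inputs fit in a list. A thread pool is sufficient here because each subprocess.call releases the GIL while it waits, and the real work happens in the separate mycode.py processes, which the OS spreads across cores.
# Sketch: run mycode.py once per work item, up to 4 at a time.
# The input values and the 4-thread pool size are assumptions; tune both.
import subprocess
from multiprocessing.dummy import Pool as ThreadPool

work_items = ['input1', 'input2', 'input3', 'input4']

def worker(arg):
    # Passing the command as a list avoids shell=True.
    # Returns the child process's exit code.
    return subprocess.call(['python', 'mycode.py', arg])

with ThreadPool(4) as pool:
    exit_codes = pool.map(worker, work_items)

print(exit_codes)  # e.g. [0, 0, 0, 0] if every run succeeded
If the per-item work were instead done inside the worker in pure Python, the process Pool would be the right choice, since threads in one interpreter share a single GIL.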
Regarding "python spreading subprocess.call on multiple CPU cores", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41626089/