I have a Python script that I run in a shell using the subprocess package:
subprocess.call("mycode.py", shell=inshell)
When I run the top command, I see that I am only using about 30% or less of the CPU. I realize some commands may be disk-bound rather than CPU-bound, so I am timing the runs. Running this on a Linux system seems slower than on a 2-core Mac.
How can I parallelize this with the threading or multiprocessing package so that I can use multiple CPU cores on that Linux system?
Best Answer
To parallelize the work done in mycode.py, you need to organize your code so that it fits this basic pattern:
# Import the kind of pool you want to use (processes or threads).
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool
# Collect work items as an iterable of single values (eg tuples,
# dicts, or objects). If you can't hold all items in memory,
# define a function that yields work items instead.
work_items = [
    (1, 'A', True),
    (2, 'X', False),
    ...
]
# Define a callable to do the work. It should take one work item.
def worker(tup):
    # Do the work.
    ...
    # Return any results.
    ...
# Create a ThreadPool (or a process Pool) of desired size.
# What size? Experiment. Slowly increase until it stops helping.
pool = ThreadPool(4)
# Do work and collect results.
# Or use pool.imap() or pool.imap_unordered().
work_results = pool.map(worker, work_items)
# Wrap up.
pool.close()
pool.join()
---------------------
# Or, in Python 3.3+ you can do it like this, skipping the wrap-up code.
with ThreadPool(4) as pool:
    work_results = pool.map(worker, work_items)
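Applied to the question above, the worker can launch mycode.py via subprocess, one child process per work item. Here is a minimal sketch under two assumptions: that mycode.py accepts a single command-line argument (the argument values below are hypothetical), and that the inputs fit in a list. A thread pool is sufficient here because each subprocess.call releases the GIL while it waits, and the real work happens in the separate mycode.py processes, which the OS spreads across cores.
# Sketch: run mycode.py once per work item, up to 4 at a time.
# The input values and the 4-thread pool size are assumptions; tune both.
import subprocess
from multiprocessing.dummy import Pool as ThreadPool

work_items = ['input1', 'input2', 'input3', 'input4']

def worker(arg):
    # Passing the command as a list avoids shell=True.
    # Returns the child process's exit code.
    return subprocess.call(['python', 'mycode.py', arg])

with ThreadPool(4) as pool:
    exit_codes = pool.map(worker, work_items)

print(exit_codes)  # e.g. [0, 0, 0, 0] if every run succeeded
If the per-item work were instead done inside the worker in pure Python, the process Pool would be the right choice, since threads in one interpreter share a single GIL.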
Regarding "python spreading subprocess.call on multiple CPU cores", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41626089/