python - Running independent processes in parallel

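I want to start several processes in a loop, but since each takes a long time to finish, I think it would be better to run them in parallel. The processes are all independent: none of them depends on another's result. Here is a small example of the kind of loop I am working with: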

inDir = '/path/to/your/dir/'
inTxtList = ['a.txt','b.txt','c.txt','d.txt','e.txt']
for i in inTxtList:
    myfile = open(i,'w')
    myfile.write("This is a text file written in python\n")
    myfile.close()

I tried the multiprocessing package and came up with the following code:

import multiprocessing

def worker(num):
    """thread worker function"""
    myfile = open(num,'w')
    myfile.write("This is my first text file written in python\n")
    myfile.close()
    return

if __name__ == '__main__':
    jobs = []
    for i in inTxtList:
        p = multiprocessing.Process(target=worker, args=(inDir+i,))
        jobs.append(p)
        p.start()
        p.join()

It does work, but I don't know how to set the number of workers. Can you help me?

Best answer

Use multiprocessing.Pool.map. You can set the number of worker processes by passing the processes argument when you create the Pool object:

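import os
import multiprocessing

# Same values as in the question
inDir = '/path/to/your/dir/'
inTxtList = ['a.txt','b.txt','c.txt','d.txt','e.txt']

def worker(num):
    # num is the full path of the file to write
    with open(num, 'w') as f:
        f.write("This is my first text file written in python\n")

if __name__ == '__main__':
    number_of_workers = 4
    pool = multiprocessing.Pool(processes=number_of_workers)
    # map blocks until every file has been written
    pool.map(worker, [os.path.join(inDir, i) for i in inTxtList])
    pool.close()
    pool.join()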

By the way, use os.path.join instead of concatenating path components by hand.
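
As a variation on the answer's approach, here is a minimal sketch that uses the pool as a context manager, so the worker processes are cleaned up automatically once the block exits. The helper build_paths and the temporary directory standing in for inDir are assumptions made for illustration; they are not part of the original question or answer.

```python
import multiprocessing
import os
import tempfile


def worker(path):
    """Write one line to the file at `path` and return the path."""
    with open(path, "w") as f:
        f.write("This is my first text file written in python\n")
    return path


def build_paths(directory, names):
    """Join each file name onto `directory` (hypothetical helper)."""
    return [os.path.join(directory, name) for name in names]


if __name__ == "__main__":
    names = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]
    # A temporary directory stands in for inDir so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as in_dir:
        # At most 4 files are written at the same time; processes=None
        # would let the pool pick a default based on the CPU count.
        with multiprocessing.Pool(processes=4) as pool:
            # map blocks until every path has been processed.
            written = pool.map(worker, build_paths(in_dir, names))
        print(sorted(os.listdir(in_dir)))
```

Note that in the question's loop, p.join() is called right after p.start(), so each process finishes before the next one starts and the files are written one at a time. With a pool, up to processes tasks run concurrently and the rest wait in a queue.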

For more on "python - Running independent processes in parallel", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20978577/
