python - Multiprocessing in Python: what does the forkserver process inherit from the parent process?

I am trying to use forkserver, and I get NameError: name 'xxx' is not defined in the worker processes.
I am using Python 3.6.4, but the documentation should be the same. At https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods it says:

The fork server process is single threaded so it is safe for it to use os.fork(). No unnecessary resources are inherited.

It also says:

Better to inherit than pickle/unpickle

When using the spawn or forkserver start methods many types from multiprocessing need to be picklable so that child processes can use them. However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.

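Apparently the key object my worker processes need is not inherited by the server process and then passed on to the workers. Why does this happen? What exactly does the forkserver process inherit from the parent process?
Here is what my code looks like: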

import multiprocessing
# (plus a bunch of other modules)

def worker_func(item):
    # imap_unordered passes one element of nameList per call.
    # This lookup is where the NameError is raised in the worker.
    info = largeObject[item]  # get some info from largeObject using item as index
    # do some calculation
    return [item, info]

if __name__ == '__main__':
    result = []
    largeObject = ...  # placeholder: my large object; read-only, never modified
    nameList = ...  # placeholder: list of items to look up in largeObject
    ctx_in_main = multiprocessing.get_context('forkserver')
    print('Start parallel, using forking/spawning/?:', ctx_in_main.get_start_method())
    cores = ctx_in_main.cpu_count()
    with ctx_in_main.Pool(processes=4) as pool:
        for x in pool.imap_unordered(worker_func, nameList):
            result.append(x)

Thanks!
Best,

Best answer

Theory
The following is an excerpt from Bojan Nikolic's blog:

Modern Python versions (on Linux) provide three ways of starting the separate processes:

Fork()-ing the parent processes and continuing with the same processes image in both parent and child. This method is fast, but potentially unreliable when parent state is complex

Spawning the child processes, i.e., fork()-ing and then execv to replace the process image with a new Python process. This method is reliable but slow, as the processes image is reloaded afresh.

The forkserver mechanism, which consists of a separate Python server with that has a relatively simple state and which is fork()-ed when a new processes is needed. This method combines the speed of Fork()-ing with good reliability (because the parent being forked is in a simple state).

Forkserver

The third method, forkserver, is illustrated below. Note that children retain a copy of the forkserver state. This state is intended to be relatively simple, but it is possible to adjust this through the multiprocess API through the set_forkserver_preload() method.
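To check which of these start methods a given platform actually offers, something like the following can help. It is only a sketch: available_start_methods and pick_context are hypothetical helper names, not part of multiprocessing, and the fallback to 'spawn' is my own assumption about what a reasonable default would be.

```python
import multiprocessing


def available_start_methods():
    """Return the start methods supported on this platform."""
    return multiprocessing.get_all_start_methods()


def pick_context(preferred="forkserver"):
    """Return a context for `preferred`, falling back to 'spawn'.

    'spawn' is used as the fallback because it is available on every
    platform that multiprocessing supports.
    """
    method = preferred if preferred in available_start_methods() else "spawn"
    return multiprocessing.get_context(method)


if __name__ == "__main__":
    print("available:", available_start_methods())
    ctx = pick_context()
    print("using:", ctx.get_start_method())
```

Using a context object rather than multiprocessing.set_start_method() keeps the choice local to the code that needs it.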

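Practice
So if you want the child processes to inherit something, it has to be part of the forkserver's state, because the children are forked from the forkserver and not from your parent process. You declare that state with set_forkserver_preload(module_names), which sets the list of module names that the forkserver process will try to import. Here is an example: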

# inherited.py
large_obj = {"one": 1, "two": 2, "three": 3}

# main.py
import multiprocessing
import os
from time import sleep

from inherited import large_obj


def worker_func(key: str):
    print(f"PID={os.getpid()}, obj id={id(large_obj)}")
    sleep(1)
    return large_obj[key]


if __name__ == '__main__':
    result = []
    ctx_in_main = multiprocessing.get_context('forkserver')
    ctx_in_main.set_forkserver_preload(['inherited'])
    cores = ctx_in_main.cpu_count()
    with ctx_in_main.Pool(processes=cores) as pool:
        for x in pool.imap(worker_func, ["one", "two", "three"]):
            result.append(x)
    for res in result:
        print(res)

Output:

# The PIDs are different but the address is always the same
PID=18603, obj id=139913466185024
PID=18604, obj id=139913466185024
PID=18605, obj id=139913466185024

If we don't use preloading:

...
    ctx_in_main = multiprocessing.get_context('forkserver')
    # ctx_in_main.set_forkserver_preload(['inherited']) 
    cores = ctx_in_main.cpu_count()
...

# The PIDs are different, the addresses are different too
# (but sometimes they can coincide)
PID=19046, obj id=140011789067776
PID=19047, obj id=140011789030976
PID=19048, obj id=140011789030912
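Preloading only helps when the object lives in an importable module. In the question the object is created inside the if __name__ == '__main__': block, and with forkserver the workers import the main module without running that block, so the name is never defined there. If the object cannot be moved into a module, one alternative (a different technique from the preload approach above, and only a sketch) is to hand it to each worker once through the Pool's initializer. The names init_worker, lookup and run are hypothetical. Note that this pickles the object once per worker, which is the cost the documentation's "better to inherit than pickle" advice warns about, so it may be a poor fit for very large objects.

```python
import multiprocessing

_large_obj = None  # set in each worker process by init_worker()


def init_worker(obj):
    """Runs once in every worker; stores the object in a module global."""
    global _large_obj
    _large_obj = obj


def lookup(key):
    """Worker task: read from the per-process copy of the object."""
    return key, _large_obj[key]


def run(names, large_obj, method="forkserver"):
    """Look up every name in a pool and return the sorted results."""
    # Assumption: fall back to 'spawn' where forkserver is unavailable.
    if method not in multiprocessing.get_all_start_methods():
        method = "spawn"
    ctx = multiprocessing.get_context(method)
    with ctx.Pool(processes=2, initializer=init_worker,
                  initargs=(large_obj,)) as pool:
        return sorted(pool.imap_unordered(lookup, names))


if __name__ == "__main__":
    large_obj = {"one": 1, "two": 2, "three": 3}
    print(run(["one", "two", "three"], large_obj))
```

The object must be picklable for this to work, and each worker ends up with its own copy.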

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/63424251/
