python - Programmatically execute a Python file from within Python in a fresh-looking Python environment

Tags: python python-import python-importlib python-exec pyfakefs

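Suppose I have a file script.py located at path = "foo/bar/script.py". I'm looking for a way to execute script.py programmatically from within my main Python program through a function execute_script(). However, I have a few requirements that seem to rule out the naive approaches involving importlib or exec():

  • script.py should be executed in a "fresh-looking" Python environment, as if it had been run via $ python script.py. That is, all relevant globals such as __name__, __file__, sys.modules, sys.path and the working directory should be set accordingly, and as little information as possible should leak from my main program into the execution of the file. (It is fine, though, if script.py can find out through the inspect module that it was not run directly via $ python script.py.)

  • I need access to the result of the execution, i.e. execute_script() should return the module given by script.py with all its variables, functions and classes. (This rules out starting a new Python interpreter in a subprocess.)

  • execute_script() must internally use open() to read script.py, so that I can use the pyfakefs package to mock the file system during unit tests. (This rules out simple solutions involving importlib.)

  • execute_script() must not (permanently) modify any global state in my main program, such as sys.path or sys.modules.

  • If possible, script.py should not be able to affect the global state of my main program. (At the very least, it should not be able to affect sys.path and sys.modules in my main program.)

  • I need to be able to modify the sys.path that script.py sees. execute_script() should therefore accept an optional list of system paths as an argument.

  • Stack traces and error handling during the execution of script.py should work as usual. (This makes solutions involving exec() difficult.)

  • The solution should be as future-proof as possible and not depend on implementation details of the Python interpreter.

I would be grateful for any ideas!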

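Best answer

I just found out that exec() also accepts code objects (which can be obtained, for example, from compile()) and came up with an approach that seems to satisfy almost all of the requirements. "Almost" because, apart from sys.path and sys.modules, the script can still affect the global state of the main program. It can also see all modules that were imported before execute_script() was called. For the time being I'm happy with that, though.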
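
As a minimal sketch of the compile()/exec() idea on its own (the file name demo_script.py and the inline source string are made up for illustration; nothing is read from disk here), the filename passed to compile() is what later shows up in stack traces, and the module's __dict__ serves as the script's global namespace:

```python
import traceback
import types

# Hypothetical script source; in the real function this comes from open()
source = "x = 1\ndef f():\n    raise ValueError('boom')\n"

# The filename given here is the one reported in tracebacks
code = compile(source, filename="demo_script.py", mode="exec")

# A fresh module object whose __dict__ becomes the script's globals
mod = types.ModuleType("__main__")
mod.__file__ = "demo_script.py"
exec(code, mod.__dict__)

print(mod.x)  # 1

try:
    mod.f()
except ValueError:
    tb_text = traceback.format_exc()

# The traceback refers to the filename passed to compile()
print(tb_text)
```

Because the code object carries its own filename and line numbers, errors raised inside the script point at the script rather than at the exec() call site.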

Here is the complete code, including tests (the function is named exec_script() here):

import os
import sys
import types
from typing import List, Optional


# Alias for the built-in module type, used for annotations and construction
module = types.ModuleType


def create_module(name: str, file: str) -> module:
    mod = module(name)
    # Instances of `module` automatically come with properties __doc__,
    # __loader__, __name__, __package__ and __spec__. Let's add some
    # more properties that main modules usually come with:

    mod.__annotations__ = {}
    # __builtins__ doesn't show up in dir() but still exists
    mod.__builtins__ = __builtins__
    mod.__file__ = file

    return mod


def exec_script(
    path: str, working_dir: str, syspath: Optional[List[str]] = None
) -> module:
    """
    Execute a Python script as if it were executed using `$ python
    <path>` from inside the given working directory. `path` can either
    be an absolute path or a path relative to `working_dir`.

    If `syspath` is provided, a copy of it will be used as `sys.path`
    during execution. Otherwise, `sys.path` will be set to
    `sys.path[1:]` which – assuming that `sys.path` has not been
    modified so far – removes the working directory from the time when
    the current Python program was started. Either way, the directory
    containing the script at `path` will always be added at position 0
    in `sys.path` afterwards, so as to simulate execution via `$ python
    <path>`.
    """

    if os.path.isabs(path):
        abs_path = path
    else:
        abs_path = os.path.join(os.path.abspath(working_dir), path)

    with open(abs_path, "r") as f:
        source = f.read()

    if sys.version_info < (3, 9):
        # Prior to Python 3.9, the __file__ variable inside the main
        # module always contained the path exactly as it was given to `$
        # python`, no matter whether it is relative or absolute and/or a
        # symlink.
        the__file__ = path
    else:
        # Starting from Python 3.9, __file__ inside the main module is
        # always an absolute path.
        the__file__ = abs_path

    # The filename passed to compile() will be used in stack traces and
    # error messages. Normally it agrees with __file__.
    code = compile(source, filename=the__file__, mode="exec")

    sysmodules_backup = sys.modules
    sys.modules = sys.modules.copy()
    the_module = create_module(name="__main__", file=the__file__)
    sys.modules["__main__"] = the_module

    # According to
    # https://docs.python.org/3/tutorial/modules.html#the-module-search-path
    # if the script is a symlink, the symlink is followed before the
    # directory containing the script is added to sys.path.
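    if os.path.islink(abs_path):
        # realpath() yields an absolute target; os.readlink() may return a
        # path relative to the directory containing the link
        sys_path_dir = os.path.dirname(os.path.realpath(abs_path))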
    else:
        sys_path_dir = os.path.dirname(abs_path)

    if syspath is None:
        syspath = sys.path[1:]
    syspath_backup = sys.path
    sys.path = [
        sys_path_dir
    ] + syspath  # This will automatically create a copy of syspath

    cwd_backup = os.getcwd()
    try:
        os.chdir(working_dir)

        # For code inside a module, global and local variables are given by
        # the *same* dictionary
        globals_ = the_module.__dict__
        locals_ = the_module.__dict__
        exec(code, globals_, locals_)
    finally:
        # Restore the global state even if the script (or chdir) raises
        os.chdir(cwd_backup)
        sys.modules = sysmodules_backup
        sys.path = syspath_backup

    return the_module


#################
##### Tests #####
#################

# Make sure to install pyfakefs via pip!

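import unittest

# `import pyfakefs` alone does not load the fake_filesystem_unittest submodule
from pyfakefs import fake_filesystem_unittest


class Test_exec_script(fake_filesystem_unittest.TestCase):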
    def setUp(self):
        self.setUpPyfakefs()
        self.fs.create_file(
            "/folder/script.py",
            contents="\n".join(
                [
                    "import os",
                    "import sys",
                    "",
                    "cwd = os.getcwd()",
                    "sysmodules = sys.modules",
                    "syspath = sys.path",
                    "",
                    "sys.modules['test_module'] = 'bar'",
                    "sys.path.append('/some/path')",
                ]
            ),
        )
        self.fs.create_symlink("/folder2/symlink.py", "/folder/script.py")

    #
    # __name__
    #
    def test__name__is_set_correctly(self):
        module = exec_script("script.py", "/folder")

        assert module.__name__ == "__main__"

    #
    # __file__
    #
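    def test_relative_path_works_and__file__shows_it(self):
        module = exec_script("script.py", "/folder")

        # From Python 3.9 on, exec_script() stores the absolute path
        if sys.version_info < (3, 9):
            assert module.__file__ == "script.py"
        else:
            assert module.__file__ == "/folder/script.py"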

    def test_absolute_path_works_and__file__shows_it(self):
        module = exec_script("/folder/script.py", "/folder")

        assert module.__file__ == "/folder/script.py"

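    def test__file__doesnt_follow_symlink(self):
        module = exec_script("symlink.py", "/folder2")

        # From Python 3.9 on, exec_script() stores the absolute path
        if sys.version_info < (3, 9):
            assert module.__file__ == "symlink.py"
        else:
            assert module.__file__ == "/folder2/symlink.py"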

    #
    # working dir
    #
    def test_working_directory_is_set_and_reset_correctly(self):
        os.chdir("/")

        module = exec_script("/folder/script.py", "/folder")

        assert module.cwd == "/folder"
        assert os.getcwd() == "/"

    #
    # sys.modules
    #
    def test__main__module_is_set_correctly(self):
        module = exec_script("/folder/script.py", "/folder")

        assert module.sysmodules["__main__"] == module

    def test_script_cannot_modify_our_sys_modules(self):
        sysmodules_backup = sys.modules.copy()

        exec_script("/folder/script.py", "/folder")

        assert sys.modules == sysmodules_backup

    #
    # sys.path
    #
    def test_script_cannot_modify_our_sys_path(self):
        syspath_backup = sys.path.copy()

        exec_script("/folder/script.py", "/folder")

        assert sys.path == syspath_backup

    def test_sys_path_is_set_up_correctly(self):
        syspath_backup = sys.path[:]
        module = exec_script("/folder/script.py", "/folder")

        assert module.syspath[0] == "/folder"
        assert module.syspath[1:] == syspath_backup[1:] + ["/some/path"]

    def test_symlink_is_followed_before_adding_base_dir_to_sys_path(self):
        module = exec_script("symlink.py", "/folder2")

        assert module.syspath[0] == "/folder"


if __name__ == "__main__":
    unittest.main()
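
One caveat on rebinding sys.modules to a copy: the Python documentation for sys.modules notes that replacing the dictionary will not necessarily work as expected. A possible variant, which is not part of the answer above, is to snapshot the contents and restore them in place afterwards. The helper name preserved_sys_state is hypothetical, and this is only a sketch of the idea:

```python
import contextlib
import sys


@contextlib.contextmanager
def preserved_sys_state():
    """Hypothetical helper: snapshot sys.modules and sys.path, restore on exit."""
    saved_modules = dict(sys.modules)
    saved_path = list(sys.path)
    try:
        yield
    finally:
        # Mutate the existing objects in place instead of rebinding them,
        # so every holder of a reference sees the restored contents.
        for name in list(sys.modules):
            if name not in saved_modules:
                del sys.modules[name]
        sys.modules.update(saved_modules)
        sys.path[:] = saved_path


modules_obj = sys.modules
path_before = list(sys.path)

with preserved_sys_state():
    # Simulate what a script might do to the global state
    sys.modules["hypothetical_module"] = object()
    sys.path.append("/some/path")

print("hypothetical_module" in sys.modules)  # False
```

The trade-off is that, during execution, the script works on the real sys.modules and sys.path objects rather than on copies, so the isolation only takes effect once the block exits.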

Regarding python - Programmatically execute a Python file from within Python in a fresh-looking Python environment, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64068369/
