python - 直接调用和分配给变量之间的速度差异

情况:考虑以下两个 Python 代码片段:-

代码1:

for root, dirs, files in os.walk(top):
    for f in files:
        path = os.path.join(root, f)
        print(path)

代码2:

for root, dirs, files in os.walk(top):
    for f in files:
        print(os.path,join(root,f))

问题:如果我不将文件路径声明为变量，性能或速度方面是否会有任何差异(假设我只会使用它一次 - 如果使用多次声明变量更有意义)

最佳答案

除了使用 timeit对于简单的基准测试，您可以pytest-benchmark，这使得创建比较变得非常简单，只需:

import os

def f1(top):
    for root, dirs, files in os.walk(top):
        for f in files:
            path = os.path.join(root, f)
            print(path)

def f2(top):
    for root, dirs, files in os.walk(top):
        for f in files:
            print(os.path.join(root, f))

def test_f1(benchmark):
    benchmark(f1, '~/tmp')

def test_f2(benchmark):
    benchmark(f2, '~/tmp')

注意:~/tmp 包含 350 个文件/文件夹，YMMV。运行

python -m pytest test.py --benchmark-min-time=0.001 --benchmark-histogram=hist

为您提供良好的数据和直方图:

----------------------------------------------------------------------- benchmark: 2 tests ----------------------------------------------------------------------
Name (time in us)        Min               Max              Mean            StdDev            Median               IQR            Outliers(*)  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
test_f1               4.4811 (1.0)      8.6253 (1.0)      4.7941 (1.00)     0.3531 (1.0)      4.7141 (1.01)     0.2762 (1.31)            15;7     216        1000
test_f2               4.4967 (1.00)     9.3009 (1.08)     4.7773 (1.0)      0.5242 (1.48)     4.6838 (1.0)      0.2113 (1.0)             6;13     215        1000
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

如您所见，考虑到高方差，差异并不显着。

现在，如果您仍然好奇，可以使用 dis显示 CPython 正在执行的字节码。这是 CPython 解释器的功能，这是运行 python 代码的最常见方式:

In [1]: import os, dis

In [2]: def f1(top):
   ...:     for root, dirs, files in os.walk(top):
   ...:         for f in files:
   ...:             path = os.path.join(root, f)
   ...:             print(path)
   ...:             

In [3]: def f2(top):
   ...:     for root, dirs, files, in os.walk(top):
   ...:         for f in files:
   ...:             print(os.path.join(root, f))
   ...:             

In [4]: dis.dis(f1)
  2           0 SETUP_LOOP              60 (to 62)
              2 LOAD_GLOBAL              0 (os)
              4 LOAD_ATTR                1 (walk)
              6 LOAD_FAST                0 (top)
              8 CALL_FUNCTION            1
             10 GET_ITER
        >>   12 FOR_ITER                46 (to 60)
             14 UNPACK_SEQUENCE          3
             16 STORE_FAST               1 (root)
             18 STORE_FAST               2 (dirs)
             20 STORE_FAST               3 (files)

  3          22 SETUP_LOOP              34 (to 58)
             24 LOAD_FAST                3 (files)
             26 GET_ITER
        >>   28 FOR_ITER                26 (to 56)
             30 STORE_FAST               4 (f)

  4          32 LOAD_GLOBAL              0 (os)
             34 LOAD_ATTR                2 (path)
             36 LOAD_ATTR                3 (join)
             38 LOAD_FAST                1 (root)
             40 LOAD_FAST                4 (f)
             42 CALL_FUNCTION            2
             44 STORE_FAST               5 (path)

  5          46 LOAD_GLOBAL              4 (print)
             48 LOAD_FAST                5 (path)
             50 CALL_FUNCTION            1
             52 POP_TOP
             54 JUMP_ABSOLUTE           28
        >>   56 POP_BLOCK
        >>   58 JUMP_ABSOLUTE           12
        >>   60 POP_BLOCK
        >>   62 LOAD_CONST               0 (None)
             64 RETURN_VALUE

In [5]: dis.dis(f2)
  2           0 SETUP_LOOP              56 (to 58)
              2 LOAD_GLOBAL              0 (os)
              4 LOAD_ATTR                1 (walk)
              6 LOAD_FAST                0 (top)
              8 CALL_FUNCTION            1
             10 GET_ITER
        >>   12 FOR_ITER                42 (to 56)
             14 UNPACK_SEQUENCE          3
             16 STORE_FAST               1 (root)
             18 STORE_FAST               2 (dirs)
             20 STORE_FAST               3 (files)

  3          22 SETUP_LOOP              30 (to 54)
             24 LOAD_FAST                3 (files)
             26 GET_ITER
        >>   28 FOR_ITER                22 (to 52)
             30 STORE_FAST               4 (f)

  4          32 LOAD_GLOBAL              2 (print)
             34 LOAD_GLOBAL              0 (os)
             36 LOAD_ATTR                3 (path)
             38 LOAD_ATTR                4 (join)
             40 LOAD_FAST                1 (root)
             42 LOAD_FAST                4 (f)
             44 CALL_FUNCTION            2
             46 CALL_FUNCTION            1
             48 POP_TOP
             50 JUMP_ABSOLUTE           28
        >>   52 POP_BLOCK
        >>   54 JUMP_ABSOLUTE           12
        >>   56 POP_BLOCK
        >>   58 LOAD_CONST               0 (None)
             60 RETURN_VALUE

所以第一个代码确实产生了更多的字节码指令。

无论如何，你应该考虑profiling - 确保您查看真正相关的代码部分，并避免盲目优化。

关于python - 直接调用和分配给变量之间的速度差异，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/44104630/

python - 直接调用和分配给变量之间的速度差异

上一篇：python - 如何在项目中正确使用辅助模块？

下一篇：python - 管理 'for' 循环中特定类的列表 - Python