python - 3个嵌套循环: Optimizing a simple simulation for speed

背景

我遇到了一个难题。这是:

One day, an alien comes to Earth. Every day, each alien does one of four things, each with equal probability to:

Kill himself

Do nothing

Split himself into two aliens (while killing himself)

split himself into three aliens (while killing himself)

What is the probability that the alien species eventually dies out entirely?

Link to the source and the solution, problem #10

不幸的是，我没能从理论上解决这个问题。然后我继续使用基本的马尔可夫链和蒙特卡罗模拟来模拟它。

面试时我没有问过这个问题。我从 friend 那里了解到这个问题，然后在寻找数学解决方案时找到了上面的链接。

重新解释问题

我们从外星人的数量开始n = 1。 n 有机会不改变、减少 1、增加 1、减少 2 >, 每个 %25。如果n增加，即外星人成倍增加，我们会再次重复此过程n次。这对应着每个外星人都会再次做它的事情。不过，我必须设置一个上限，以便我们停止模拟并避免崩溃。 n 可能会增加，我们会一次又一次地循环 n 次。

如果外星人以某种方式灭绝，我们将再次停止模拟，因为没有什么可以模拟的了。

在n达到零或上限后，我们还记录总体(它将为零或某个数字>= max_pop)。

我重复了很多次并记录了每个结果。最后，零的数量除以结果总数应该给我一个近似值。

代码

from random import randint
import numpy as np

pop_max = 100
iter_max = 100000

results = np.zeros(iter_max, dtype=int)

for i in range(iter_max):
    n = 1
    while n > 0 and n < pop_max:
        for j in range(n):
            x = randint(1, 4)
            if x == 1:
                n = n - 1
            elif x == 2:
                continue
            elif x == 3:
                n = n + 1
            elif x == 4:
                n = n + 2
    results[i] = n

print( np.bincount(results)[0] / iter_max )

iter_max和pop_max确实可以改变，但我想如果有100个外星人，他们灭绝的概率会低得可以忽略不计。但这只是一个猜测，我没有做任何事情来计算(更)适当的人口上限。

这段代码给出了有希望的结果，相当接近真实的答案，大约是 %41.4。

一些输出

> python aliens.py
0.41393
> python aliens.py
0.41808
> python aliens.py
0.41574
> python aliens.py
0.4149
> python aliens.py
0.41505
> python aliens.py
0.41277
> python aliens.py
0.41428
> python aliens.py
0.41407
> python aliens.py
0.41676

后果

我对结果感到满意，但对于这段代码所花费的时间，我不能说同样的事情。大约需要 16-17 秒:)

如何提高速度？如何优化循环(尤其是 while 循环)？也许有更好的方法或更好的模型？

最佳答案

您可以通过使用 numpy 一次性生成 n 个随机整数(更快)来矢量化您的内部循环，并使用算术而不是 bool 逻辑来消除所有 if 语句。

while...: 
    #population changes by (-1, 0, +1, +2) for each alien
    n += np.random.randint(-1,3, size=n).sum()

使用您的确切代码进行其他操作(您可能可以在其他地方找到其他优化)，通过这一更改，我将时间从 21.2 秒缩短到了 4.3 秒。

在不更改算法的情况下(即使用蒙特卡罗以外的方法求解)，我看不到任何其他彻底的更改可以使其更快，直到您开始编译为机器代码(幸运的是，如果您有的话，这很容易)已安装 numba)。

我不会提供有关 numba 执行的即时编译的完整教程，但我只会分享我的代码并记下我所做的更改:

from time import time
import numpy as np
from numpy.random import randint
from numba import njit, int32, prange

@njit('i4(i4)')
def simulate(pop_max): #move simulation of one population to a function for parallelization
    n = 1
    while 0 < n < pop_max:
        n += np.sum(randint(-1,3,n))
    return n

@njit('i4[:](i4,i4)', parallel=True)
def solve(pop_max, iter_max):
    #this could be easily simplified to just return the raio of populations that die off vs survive to pop_max
    # which would save you some ram (though the speed is about the same)
    results = np.zeros(iter_max, dtype=int32) #numba needs int32 here rather than python int
    for i in prange(iter_max): #prange specifies that this loop can be parallelized
        results[i] = simulate(pop_max)
    return results

pop_max = 100
iter_max = 100000

t = time()
print( np.bincount(solve(pop_max, iter_max))[0] / iter_max )
print('time elapsed: ', time()-t)

在我的系统上，并行化编译可将计算速度降低至约 0.15 秒。

关于python - 3个嵌套循环: Optimizing a simple simulation for speed，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/53708231/

python - 3个嵌套循环: Optimizing a simple simulation for speed

背景

重新解释问题

代码

一些输出

后果

上一篇：python - 如何在其他函数的曲线上画摆线(其他摆线)？

下一篇：python - 分组数据框的 Seaborn 群图