python - 关于蒙特卡罗概率语法

让 20 个人(包括 3 名女性)随机坐在 4 张 table 上(表示为 (A,B,C,D))，每张 table 有 5 人，所有安排的可能性均等。设 X 为没有女性坐在的 table 的数量。编写一个 numpy 蒙特卡洛模拟来估计 X 的期望，并估计没有女性坐在 A table 的概率 p。针对 3 个案例 (100,1000,10000) 运行模拟

我想定义一个函数，利用 numpy 的 random.permutation 函数来计算 X 的期望值，给定一定数量的试验我知道如何在笔和纸上做到这一点，迭代我的概率集合并相乘它们相互关联，以便我可以计算事件的总概率。这就是我到目前为止所拥有的

T = 4       # number of tables
N = 20      # number of persons. Assumption: N is a multiple of T.
K = 5       # capacity per table
W = 3       # number of women. Assumption: first W of N persons are women.
M =100      #number of trials

collection = []

for i in range(K):


    x = (((N-W)-i)/(N-i))

    collection.append(x)

如果我检查我的集合，这是我的输出:[0.85, 0.8421052631578947, 0.8333333333333334, 0.8235294117647058, 0.8125]

最佳答案

实现

这是蒙特卡洛模拟的简单实现。它并不是为了高性能而设计的，而是允许您交叉检查设置并查看详细信息:

import collections
import numpy as np

def runMonteCarlo(nw=3, nh=20, nt=4, N=20):
    """
    Run Monte Carlo Simulation
    """

    def countWomen(c, nt=4):
        """
        Count Number of Women per Table
        """
        x = np.array(c).reshape(nt, -1).T  # Split permutation into tables
        return np.sum(x, axis=0)           # Sum woman per table

    # Initialization:
    comp = np.array([1]*nw + [0]*(nh-nw)) # Composition: 1=woman, 0=man
    x = []                                # Counts of tables without any woman
    p = 0                                 # Probability of there is no woman at table A  

    for k in range(N):
        c = np.random.permutation(comp)   # Random permutation, table composition
        w = countWomen(c, nt=nt)          # Count Woman per table
        nc = np.sum(w!=0)                 # Count how many tables with women 
        x.append(nt - nc)                 # Store count of tables without any woman
        p += int(w[0]==0)                 # Is table A empty?
        #if k % 100 == 0:
            #print(c, w, nc, nt-nc, p)

    # Rationalize (count->frequency)
    r = collections.Counter(x)
    r = {k:r.get(k, 0)/N for k in range(nt+1)}
    p /= N
    return r, p

执行工作:

for n in [100, 1000, 10000]:
    s = runMonteCarlo(N=n)
    E = sum([k*v for k,v in s[0].items()])
    print('N=%d, P(X=k) = %s, p=%s, E[X]=%s' % (n, *s, E))

N=100, P(X=k) = {0: 0.0, 1: 0.43, 2: 0.54, 3: 0.03, 4: 0.0}, p=0.38, E[X]=1.6
N=1000, P(X=k) = {0: 0.0, 1: 0.428, 2: 0.543, 3: 0.029, 4: 0.0}, p=0.376, E[X]=1.601
N=10000, P(X=k) = {0: 0.0, 1: 0.442, 2: 0.5235, 3: 0.0345, 4: 0.0}, p=0.4011, E[X]=1.5924999999999998

绘制分布图，可得出:

import pandas as pd
axe = pd.DataFrame.from_dict(s[0], orient='index').plot(kind='bar')
axe.set_title("Monte Carlo Simulation")
axe.set_xlabel('Random Variable, $X$')
axe.set_ylabel('Frequency, $F(X=k)$')
axe.grid()

与替代版本的分歧

Caution: this method does not answer the stated problem!

如果我们实现另一个版本的模拟，其中我们更改随机实验的执行方式，如下所示:

import random
import collections

def runMonteCarlo2(nw=3, nh=20, nt=4, N=20):
    """
    Run Monte Carlo Simulation
    """

    def one_experiment(nt, nw):
        """
        Table setup (suggested by @Inon Peled)
        """
        return set(random.randint(0, nt-1) for _ in range(nw)) # Sample nw times from 0 <= k <= nt-1

    c = collections.Counter()             # Empty Table counter
    p = 0                                 # Probability of there is no woman at table A  

    for k in range(N):
        exp = one_experiment(nt, nw)      # Select table with at least one woman
        c.update([nt - len(exp)])         # Update Counter X distribution
        p += int(0 not in exp)            # There is no woman at table A (table 0)

    # Rationalize:
    r = {k:c.get(k, 0)/N for k in range(nt+1)}
    p /= N

    return r, p

它返回:

N=100, P(X=k) = {0: 0.0, 1: 0.41, 2: 0.51, 3: 0.08, 4: 0.0}, p=0.4, E[X]=1.67
N=1000, P(X=k) = {0: 0.0, 1: 0.366, 2: 0.577, 3: 0.057, 4: 0.0}, p=0.426, E[X]=1.691
N=1000000, P(X=k) = {0: 0.0, 1: 0.37462, 2: 0.562787, 3: 0.062593, 4: 0.0}, p=0.42231, E[X]=1.687973

第二个版本向另一个值(value)观收敛，它显然不等同于第一个版本，它不回答相同的问题。

讨论

为了区分哪个实现是正确的，我有 computed sampled spaces and probabilities对于这两种实现。看来第一个版本是正确的，因为它考虑到女性坐在餐 table 旁的概率取决于之前被选择的人。第二个版本没有考虑到这一点，这就是为什么它不需要知道有多少人以及每张 table 可以坐多少人。

这是一个很好的问题，因为两个答案都提供了接近的结果。这项工作的一个重要部分是妥善设置蒙特卡洛输入。

关于python - 关于蒙特卡罗概率语法，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/55255633/

python - 关于蒙特卡罗概率语法

实现

与替代版本的分歧

讨论

上一篇：python - 正则表达式在多种条件下匹配版权声明中的公司名称

下一篇：python - 如何在 pandas 中优化此类搜索和替换