python - Slow scipy double quadrature integration

I am trying to obtain the function expected_W, or H, which is the result of an integration:

$H(p, \theta_0, \theta_1) = \int_{-\infty}^\infty \int_{-\infty}^\infty w(p, \theta, \epsilon, \beta) f(\beta | \theta) q(\epsilon) \; d \beta \; d \epsilon$

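where:

theta is a vector with two elements: theta_0 and theta_1
f(beta | theta) is a normal density for beta with mean theta_0 and variance theta_1
q(epsilon) is a normal density for epsilon with mean zero and variance sigma_epsilon (set to 1 by default).
w(p, theta, eps, beta) is a function that I receive as input, so I can't predict exactly what it looks like. It is likely to be non-linear, but not especially nasty.

This is how I implemented the problem. I'm sure the wrapper functions I wrote are a mess, so I'd be glad to get help with that as well.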

from __future__ import division
from scipy import integrate
from scipy.stats import norm
import math
import numpy as np


def exp_w(w_B, sigma_eps = 1, **kwargs):
    '''
    Integrates the w_B function

    Input:
    + w_B : the function to be integrated. 
    + sigma_eps : variance of the epsilon term. Set to 1 by default
    '''

    #The integrand function gives everything under the integral:
    # w(B(p, \theta, \epsilon, \beta)) f(\beta | \theta ) q(\epsilon)
    def integrand(eps, beta, p, theta_0, theta_1, sigma_eps=sigma_eps):
        q_e = norm.pdf(eps, loc=0, scale=math.sqrt(sigma_eps))
        f_beta = norm.pdf(beta, loc=theta_0, scale=math.sqrt(theta_1))

        return w_B(p = p, 
                   theta_0 = theta_0, theta_1 = theta_1,
                   eps = eps, beta=beta)* q_e *f_beta

    #limits of integration. Using limited support for now.
    eps_inf = lambda beta : -10 # otherwise: -np.inf
    eps_sup = lambda beta : 10  # otherwise: np.inf
    beta_inf = -10
    beta_sup = 10

    def integrated_f(p, theta_0, theta_1):
        return integrate.dblquad(integrand, beta_inf, beta_sup,
            eps_inf, eps_sup,
            args = (p, theta_0, theta_1))
    # this integrated_f is the H referenced at the top of the question
    return integrated_f

I tested this function with a simple w function for which I know the analytic solution (this won't usually be the case).

def test_exp_w():
    def w_B(p, theta_0, theta_1, eps, beta):
        return 3*(p*eps + p*(theta_0 + theta_1) - beta)

    # Function that I get
    integrated = exp_w(w_B, sigma_eps = 1)

    # Function that I should get
    def exp_result(p, theta_0, theta_1):
        return 3*p*(theta_0 + theta_1) - 3*theta_0

    args = np.random.rand(3)
    d_args = {'p' : args[0], 'theta_0' : args[1], 'theta_1' : args[2]}

    if not (np.allclose(
    integrated(**d_args)[0], exp_result(**d_args)) ):
        raise Exception("Integration procedure isn't working!")

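So my implementation seems to work, but it is very slow for my purposes. I need to repeat this procedure tens or hundreds of thousands of times (it is one step in a value function iteration; I can give more information if people think it's relevant).

With scipy version 0.14.0 and numpy version 1.8.1, this integral takes 15 seconds to compute.

Does anyone have suggestions on how to speed this up? For a start, it would probably help to have a bounded integration domain, but I haven't figured out how to do that, or whether the Gaussian quadrature in SciPy handles it well (does it use Gauss-Hermite?).

Thank you for your time.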

---- Edit: adding profiling times -----

The %lprun results show that most of the time is spent in _distn_infrastructure.py:1529(pdf) and _continuous_distns.py:97(_norm_pdf), each with a whopping 83244 calls.

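Best answer

If the function is not a nasty one, the time taken to integrate it sounds very long.

The first thing I'd suggest is to profile where the time is spent. Is it spent in dblquad or elsewhere? How many calls are made to w_B during the integration? If the time is spent in dblquad and the number of calls is very high, could you use looser tolerances in the integration?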
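One way to answer both questions is sketched below: wrap the integrand so it counts its own evaluations, then run dblquad once with the default tolerances and once with looser epsabs/epsrel values. The helper names (make_counting_integrand, run), the parameter values and the tolerance 1e-4 are illustrative assumptions, and the normal densities are written with plain math functions (with sigma_eps fixed at 1) rather than norm.pdf to keep the sketch quick to run.

```python
from math import exp, pi, sqrt
from scipy import integrate


def make_counting_integrand(w_B):
    """Return an integrand for dblquad plus a counter of its evaluations."""
    calls = {"n": 0}

    def integrand(eps, beta, p, theta_0, theta_1):
        calls["n"] += 1
        # Normal densities written out by hand (assumption: sigma_eps = 1).
        q_e = exp(-0.5 * eps ** 2) / sqrt(2 * pi)
        f_beta = (exp(-0.5 * (beta - theta_0) ** 2 / theta_1)
                  / sqrt(2 * pi * theta_1))
        return w_B(p=p, theta_0=theta_0, theta_1=theta_1,
                   eps=eps, beta=beta) * q_e * f_beta

    return integrand, calls


def w_B(p, theta_0, theta_1, eps, beta):
    # Test function from the question; analytic result is
    # 3*p*(theta_0 + theta_1) - 3*theta_0.
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)


def run(tol):
    """Integrate with the given tolerance; return (value, number of calls)."""
    integrand, calls = make_counting_integrand(w_B)
    value, _abserr = integrate.dblquad(
        integrand, -10, 10,               # outer variable: beta
        lambda beta: -10, lambda beta: 10,  # inner variable: eps
        args=(0.5, 0.3, 0.8),             # hypothetical p, theta_0, theta_1
        epsabs=tol, epsrel=tol)
    return value, calls["n"]


tight_value, tight_calls = run(1.49e-8)   # dblquad's default tolerances
loose_value, loose_calls = run(1e-4)      # looser, illustrative tolerance
print("default tolerance:", tight_value, "calls:", tight_calls)
print("loose tolerance:  ", loose_value, "calls:", loose_calls)
```

Comparing the two call counts against the change in the result gives a rough idea of how much accuracy is being paid for; whether the looser result is accurate enough depends on the outer iteration.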

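It seems that multiplying by the Gaussians actually lets you narrow the integration limits quite a lot, since most of a Gaussian's mass lies within a very small region. You may want to try calculating reasonable, tighter bounds. You have already limited the region to -10..10; is there any significant performance difference between -100..100, -10..10, and -1..1?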
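A minimal sketch of one way to pick such bounds: take the mean plus or minus a fixed number of standard deviations for each variable. The function name gaussian_bounds, the choice of six standard deviations and the parameter values are assumptions for illustration; how many standard deviations are enough depends on how fast w grows in the tails.

```python
from math import sqrt
from scipy.stats import norm


def gaussian_bounds(mean, variance, n_sd=6.0):
    """Return (low, high) covering mean +/- n_sd standard deviations."""
    half_width = n_sd * sqrt(variance)
    return mean - half_width, mean + half_width


# Hypothetical parameter values, matching the notation of the question.
theta_0, theta_1, sigma_eps = 0.3, 0.8, 1.0

beta_inf, beta_sup = gaussian_bounds(theta_0, theta_1)
eps_inf, eps_sup = gaussian_bounds(0.0, sigma_eps)

# Probability mass of a normal density left outside +/- 6 standard deviations.
dropped_mass = 2 * norm.sf(6.0)

print("beta limits:", beta_inf, beta_sup)
print("eps limits: ", eps_inf, eps_sup)
print("mass outside the limits (per variable):", dropped_mass)
```

These limits can be passed to dblquad in place of the fixed -10..10 values, with the eps limits wrapped as lambda beta: eps_inf and lambda beta: eps_sup.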

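If you know your function is relatively smooth, there is a Mickey Mouse version of the integration:

determine reasonable upper and lower limits for both axes (from the Gaussians)
choose a reasonable grid density (e.g. 100 points in each direction)
calculate w_B at each grid point (this will be fast if you can write a vectorized version of w_B)
sum it all up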
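The four steps above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not a tuned implementation: the function name grid_expectation, the six-standard-deviation limits and the 100-point grid are illustrative choices, and w_B is assumed to accept NumPy arrays for eps and beta.

```python
import numpy as np


def grid_expectation(w_B, p, theta_0, theta_1, sigma_eps=1.0,
                     n_points=100, n_sd=6.0):
    """Approximate H(p, theta_0, theta_1) by summing over a regular grid."""
    sd_beta = np.sqrt(theta_1)
    sd_eps = np.sqrt(sigma_eps)

    # Step 1: limits taken from the Gaussians (mean +/- n_sd std deviations).
    # Step 2: regular grid with n_points in each direction.
    eps = np.linspace(-n_sd * sd_eps, n_sd * sd_eps, n_points)
    beta = np.linspace(theta_0 - n_sd * sd_beta,
                       theta_0 + n_sd * sd_beta, n_points)
    d_eps = eps[1] - eps[0]
    d_beta = beta[1] - beta[0]
    E, B = np.meshgrid(eps, beta, indexing="ij")

    # Normal densities evaluated on the whole grid at once.
    q_e = np.exp(-0.5 * (E / sd_eps) ** 2) / (sd_eps * np.sqrt(2 * np.pi))
    f_beta = (np.exp(-0.5 * ((B - theta_0) / sd_beta) ** 2)
              / (sd_beta * np.sqrt(2 * np.pi)))

    # Step 3: evaluate w_B at every grid point (vectorized call).
    values = w_B(p=p, theta_0=theta_0, theta_1=theta_1,
                 eps=E, beta=B) * q_e * f_beta

    # Step 4: sum it all up, weighting each point by the cell area.
    return values.sum() * d_eps * d_beta


def w_B(p, theta_0, theta_1, eps, beta):
    # Test function from the question; works unchanged on arrays.
    return 3 * (p * eps + p * (theta_0 + theta_1) - beta)


approx = grid_expectation(w_B, p=0.5, theta_0=0.3, theta_1=0.8)
exact = 3 * 0.5 * (0.3 + 0.8) - 3 * 0.3
print(approx, exact)
```

For a less smooth w_B the grid density would probably need to be increased, so it is worth checking the result against dblquad on a few parameter values before relying on it.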

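This is very low-tech but also very fast. Whether it gives results that are good enough for your outer iteration is an interesting question. It just might.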

Regarding python - Slow scipy double quadrature integration, the original question can be found on Stack Overflow: https://stackoverflow.com/questions/25253020/
