Python Curve_Fit Exponential/Power/Log Curve - Improve Results

Tags: python optimization scipy logistic-regression curve-fitting

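I am trying to fit this data, which approaches zero asymptotically (but never reaches it).

I believe the best curve is an inverse logistic function, but suggestions are welcome. The key is the expected decaying "S-curve" shape.

Here is my code so far, with the resulting plot below. The fit is pretty ugly.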

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# DATA

x = pd.Series([1,1,264,882,913,1095,1156,1217,1234,1261,1278,1460,1490,1490,1521,1578,1612,1612,1668,1702,1704,1735,1793,2024,2039,2313,2313,2558,2558,2617,2617,2708,2739,2770,2770,2831,2861,2892,2892,2892,2892,2892,2923,2923,2951,2951,2982,2982,3012,3012,3012,3012,3012,3012,3012,3073,3073,3073,3104,3104,3104,3104,3135,3135,3135,3135,3165,3165,3165,3165,3165,3196,3196,3196,3226,3226,3257,3316,3347,3347,3347,3347,3377,3377,3438,3469,3469]).values
y = pd.Series([1000,600,558.659217877095,400,300,100,7.75,6,8.54,6.66666666666667,7.14,1.1001100110011,1.12,0.89,1,2,0.666666666666667,0.77,1.12612612612613,0.7,0.664010624169987,0.65,0.51,0.445037828215398,0.27,0.1,0.26,0.1,0.1,0.13,0.16,0.1,0.13,0.1,0.12,0.1,0.13,0.14,0.14,0.17,0.11,0.15,0.09,0.1,0.26,0.16,0.09,0.09,0.05,0.09,0.09,0.1,0.1,0.11,0.11,0.09,0.09,0.11,0.08,0.09,0.09,0.1,0.06,0.07,0.07,0.09,0.05,0.05,0.06,0.07,0.08,0.08,0.07,0.1,0.08,0.08,0.05,0.06,0.04,0.04,0.05,0.05,0.04,0.06,0.05,0.05,0.06]).values

# Inverse Logistic Function 
# https://en.wikipedia.org/wiki/Logistic_function
def func(x, L ,x0, k, b):
    y = 1/(L / (1 + np.exp(-k*(x-x0)))+b)
    return y

# FIT DATA

p0 = [max(y), np.median(x), 1, min(y)]  # a mandatory initial guess
popt, pcov = curve_fit(func, x, y,p0, method='dogbox',maxfev=10000)

# PERFORMANCE

modelPredictions = func(x, *popt)
absError = modelPredictions - y
SE = np.square(absError) # squared errors
MSE = np.mean(SE) # mean squared errors
RMSE = np.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (np.var(absError) / np.var(y))

print('Parameters:', popt)
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

#PLOT

plt.figure()
plt.plot(x, y, 'ko', label="Original Noised Data")
plt.plot(x, func(x, *popt), 'r-', label="Fitted Curve")
plt.legend()
plt.yscale('log')
#plt.xscale('log')
plt.show()

Here is the result of running this code, and what I would like to achieve!

[Image: plot of the data with the curve_fit result in red and the desired curve drawn in blue]

How can I better optimize curve_fit so that I get something closer to the drawn blue line, rather than the red line produced by the code?

Thanks!!

Best answer

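Based on your plot of the data and the fit you expect, I would guess that you do not really want to model your data y as a logistic-like step function, but rather log(y) as a logistic-like step function.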
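One way to see why the choice of scale matters: a least-squares fit to y weights absolute errors, so points near 1000 dominate and points near 0.05 barely count, while a fit to log(y) weights relative errors equally. The snippet below is only an illustrative sketch; the three sample values are taken from the range of the data, and the "10% too high" model is a hypothetical stand-in, not anything from the question.

```python
import numpy as np

# Three values spanning the range of y in the question.
y_true = np.array([1000.0, 7.75, 0.05])

# Hypothetical model output that is 10% too high everywhere.
y_pred = 1.10 * y_true

# Squared residuals as seen by a fit to y directly.
linear_sq = (y_pred - y_true) ** 2

# Squared residuals as seen by a fit to log(y).
log_sq = (np.log(y_pred) - np.log(y_true)) ** 2

print("squared residuals on y:     ", linear_sq)
print("squared residuals on log(y):", log_sq)
```

On the linear scale the largest point outweighs the smallest by many orders of magnitude, whereas on the log scale all three residuals are identical.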

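So I think you probably want to use a logistic step function, perhaps with an added linear component, to model the log of this data. I would do this with lmfit, since it comes with built-in models, gives better reporting of results, and lets you greatly simplify the fitting code (disclaimer: I am the lead author):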

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from lmfit.models import StepModel, LinearModel

# DATA
x = pd.Series([1, 1, 264, 882, 913, 1095, 1156, 1217, 1234, 1261, 1278,
              1460, 1490, 1490, 1521, 1578, 1612, 1612, 1668, 1702, 1704,
              1735, 1793, 2024, 2039, 2313, 2313, 2558, 2558, 2617, 2617,
              2708, 2739, 2770, 2770, 2831, 2861, 2892, 2892, 2892, 2892,
              2892, 2923, 2923, 2951, 2951, 2982, 2982, 3012, 3012, 3012,
              3012, 3012, 3012, 3012, 3073, 3073, 3073, 3104, 3104, 3104,
              3104, 3135, 3135, 3135, 3135, 3165, 3165, 3165, 3165, 3165,
              3196, 3196, 3196, 3226, 3226, 3257, 3316, 3347, 3347, 3347,
              3347, 3377, 3377, 3438, 3469, 3469]).values

y = pd.Series([1000, 600, 558.659217877095, 400, 300, 100, 7.75, 6, 8.54,
              6.66666666666667, 7.14, 1.1001100110011, 1.12, 0.89, 1, 2,
              0.666666666666667, 0.77, 1.12612612612613, 0.7,
              0.664010624169987, 0.65, 0.51, 0.445037828215398, 0.27, 0.1,
              0.26, 0.1, 0.1, 0.13, 0.16, 0.1, 0.13, 0.1, 0.12, 0.1, 0.13,
              0.14, 0.14, 0.17, 0.11, 0.15, 0.09, 0.1, 0.26, 0.16, 0.09,
              0.09, 0.05, 0.09, 0.09, 0.1, 0.1, 0.11, 0.11, 0.09, 0.09,
              0.11, 0.08, 0.09, 0.09, 0.1, 0.06, 0.07, 0.07, 0.09, 0.05,
              0.05, 0.06, 0.07, 0.08, 0.08, 0.07, 0.1, 0.08, 0.08, 0.05,
              0.06, 0.04, 0.04, 0.05, 0.05, 0.04, 0.06, 0.05, 0.05, 0.06]).values

model = StepModel(form='logistic') + LinearModel()
params = model.make_params(amplitude=-5, center=1000, sigma=100, intercept=0, slope=0)

result = model.fit(np.log(y), params, x=x)

print(result.fit_report())

plt.plot(x, y, 'ko', label="Original Noised Data")
plt.plot(x, np.exp(result.best_fit), 'r-', label="Fitted Curve")
plt.legend()
plt.yscale('log')
plt.show()

This prints a report with the fit statistics and best-fit values:

[[Model]]
    (Model(step, form='logistic') + Model(linear))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 73
    # data points      = 87
    # variables        = 5
    chi-square         = 9.38961801
    reduced chi-square = 0.11450754
    Akaike info crit   = -183.688405
    Bayesian info crit = -171.358865
[[Variables]]
    amplitude: -4.89008796 +/- 0.29600969 (6.05%) (init = -5)
    center:     1180.65823 +/- 15.2836422 (1.29%) (init = 1000)
    sigma:      94.0317580 +/- 18.5328976 (19.71%) (init = 100)
    slope:     -0.00147861 +/- 8.1151e-05 (5.49%) (init = 0)
    intercept:  6.95177838 +/- 0.17170849 (2.47%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(amplitude, slope)     = -0.798
    C(amplitude, sigma)     = -0.649
    C(amplitude, intercept) = -0.605
    C(center, intercept)    = -0.574
    C(sigma, slope)         =  0.542
    C(sigma, intercept)     =  0.348
    C(center, sigma)        = -0.335
    C(amplitude, center)    =  0.282

and produces a plot like this:

[Image: the data and the fitted curve on a logarithmic y-axis]
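Written out, and assuming lmfit's logistic step has the form amplitude / (1 + exp(-(x - center) / sigma)), the composite model fitted to log(y) would be the following. The symbols A, \mu, \sigma, m and b are shorthand introduced here for amplitude, center, sigma, slope and intercept.

```latex
\log y(x) \;\approx\; \frac{A}{1 + e^{-(x - \mu)/\sigma}} + m\,x + b
```

With the reported A of about -4.89, the step lowers log(y) by roughly 4.9 around x = 1181, which corresponds to a drop of roughly a factor of 130 in y, while the negative slope m adds a slow exponential decay on top.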

You can of course reproduce all of this with scipy.optimize.curve_fit if you wish, but I will leave that as an exercise.
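As a rough sketch of that exercise (not the answer author's code), the same idea can be tried with curve_fit by fitting log(y) with a hand-written logistic step plus a line. The function name log_model is made up for this example, the step form amplitude / (1 + exp(-(x - center) / sigma)) is an assumption about what lmfit's logistic StepModel computes, and the starting values are simply copied from the lmfit example.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import expit  # numerically stable 1 / (1 + exp(-z))

# Data copied from the question.
x = np.array([1, 1, 264, 882, 913, 1095, 1156, 1217, 1234, 1261, 1278,
              1460, 1490, 1490, 1521, 1578, 1612, 1612, 1668, 1702, 1704,
              1735, 1793, 2024, 2039, 2313, 2313, 2558, 2558, 2617, 2617,
              2708, 2739, 2770, 2770, 2831, 2861, 2892, 2892, 2892, 2892,
              2892, 2923, 2923, 2951, 2951, 2982, 2982, 3012, 3012, 3012,
              3012, 3012, 3012, 3012, 3073, 3073, 3073, 3104, 3104, 3104,
              3104, 3135, 3135, 3135, 3135, 3165, 3165, 3165, 3165, 3165,
              3196, 3196, 3196, 3226, 3226, 3257, 3316, 3347, 3347, 3347,
              3347, 3377, 3377, 3438, 3469, 3469], dtype=float)

y = np.array([1000, 600, 558.659217877095, 400, 300, 100, 7.75, 6, 8.54,
              6.66666666666667, 7.14, 1.1001100110011, 1.12, 0.89, 1, 2,
              0.666666666666667, 0.77, 1.12612612612613, 0.7,
              0.664010624169987, 0.65, 0.51, 0.445037828215398, 0.27, 0.1,
              0.26, 0.1, 0.1, 0.13, 0.16, 0.1, 0.13, 0.1, 0.12, 0.1, 0.13,
              0.14, 0.14, 0.17, 0.11, 0.15, 0.09, 0.1, 0.26, 0.16, 0.09,
              0.09, 0.05, 0.09, 0.09, 0.1, 0.1, 0.11, 0.11, 0.09, 0.09,
              0.11, 0.08, 0.09, 0.09, 0.1, 0.06, 0.07, 0.07, 0.09, 0.05,
              0.05, 0.06, 0.07, 0.08, 0.08, 0.07, 0.1, 0.08, 0.08, 0.05,
              0.06, 0.04, 0.04, 0.05, 0.05, 0.04, 0.06, 0.05, 0.05, 0.06])


def log_model(x, amplitude, center, sigma, slope, intercept):
    """Logistic step plus a straight line, used here as a model for log(y)."""
    step = amplitude * expit((x - center) / sigma)
    return step + slope * x + intercept


# Starting values copied from the lmfit example (an assumption that they
# work equally well here).
p0 = [-5, 1000, 100, 0, 0]

# Fit the logarithm of the data rather than the data itself.
popt, pcov = curve_fit(log_model, x, np.log(y), p0=p0)
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainties from the covariance

names = ["amplitude", "center", "sigma", "slope", "intercept"]
for name, value, err in zip(names, popt, perr):
    print(f"{name:>9s} = {value:.6g} +/- {err:.3g}")

# Map the fitted log-space curve back to the original scale for plotting.
y_fit = np.exp(log_model(x, *popt))
```

Because curve_fit without bounds defaults to the same Levenberg-Marquardt routine that lmfit's leastsq method wraps, the parameters would be expected to land close to those in the report above. Plotting y_fit against x on a logarithmic y-axis gives the counterpart of the fitted curve in the lmfit example.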

For this question, see the similar question on Stack Overflow: https://stackoverflow.com/questions/59282936/
