algorithm - Finding the best trade-off point on a curve

Tags: algorithm matlab data-modeling model-fitting

Say I have some data to which I want to fit a parametrized model. My goal is to find the best value for this model's parameter.

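I am doing model selection using an AIC/BIC/MDL type of criterion, which rewards models with low error but also penalizes models with high complexity (we are seeking the simplest yet most convincing explanation for the data, so to speak, à la Occam's razor).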

Following the above, here is an example of the kind of thing I get for three different criteria (two to be minimized, one to be maximized):

[Figure: aic-bic fit]

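Visually you can easily see the elbow shape, and you would pick a value for the parameter somewhere in that region. The problem is that I am doing this for a large number of experiments, and I need a way to find this value without intervention.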

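My first intuition was to draw a line at a 45-degree angle from the corner and keep moving it until it intersects the curve, but that is easier said than done :) It could also miss the region of interest if the curve is somewhat skewed.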

Any thoughts on how to implement this, or any better ideas?

Here is the sample needed to reproduce one of the plots above:

curve = [8.4663 8.3457 5.4507 5.3275 4.8305 4.7895 4.6889 4.6833 4.6819 4.6542 4.6501 4.6287 4.6162 4.585 4.5535 4.5134 4.474 4.4089 4.3797 4.3494 4.3268 4.3218 4.3206 4.3206 4.3203 4.2975 4.2864 4.2821 4.2544 4.2288 4.2281 4.2265 4.2226 4.2206 4.2146 4.2144 4.2114 4.1923 4.19 4.1894 4.1785 4.178 4.1694 4.1694 4.1694 4.1556 4.1498 4.1498 4.1357 4.1222 4.1222 4.1217 4.1192 4.1178 4.1139 4.1135 4.1125 4.1035 4.1025 4.1023 4.0971 4.0969 4.0915 4.0915 4.0914 4.0836 4.0804 4.0803 4.0722 4.065 4.065 4.0649 4.0644 4.0637 4.0616 4.0616 4.061 4.0572 4.0563 4.056 4.0545 4.0545 4.0522 4.0519 4.0514 4.0484 4.0467 4.0463 4.0422 4.0392 4.0388 4.0385 4.0385 4.0383 4.038 4.0379 4.0375 4.0364 4.0353 4.0344];
plot(1:100, curve)

EDIT

I accepted the solution given by Jonas. Basically, for each point p on the curve, we find the one with the maximum distance d given by:

[Figure: point-line-distance]
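The formula image is not reproduced here. As a sketch of what it most likely expresses, the standard vector form of the distance from a point to a line is given below. The symbols a (the first point of the curve) and n̂ (the unit vector from the first point towards the last point) are notation assumed for illustration; they correspond to firstPoint and lineVecN in the answer's code.

$$
d = \left\lVert (\mathbf{p} - \mathbf{a}) - \big( (\mathbf{p} - \mathbf{a}) \cdot \hat{\mathbf{n}} \big)\, \hat{\mathbf{n}} \right\rVert
$$

The first term is the vector from the first point to p. Subtracting its projection onto the line leaves only the perpendicular component, whose norm is the distance d.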

Best answer

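A quick way of finding the elbow is to draw a line from the first point of the curve to the last, and then find the data point that is farthest away from that line.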

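This does depend somewhat on how many points lie in the flat part of the curve, but if you test the same number of parameters each time, the results should be reasonably consistent.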
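To make that caveat concrete, here is a small sketch that applies the same "farthest from the chord" rule to one curve truncated at different lengths. The curve is synthetic (an exponential decay chosen purely for illustration, not the data from the question), and the variable names are assumptions. For a 2-D point, the perpendicular distance is computed here from the cross product with the chord vector, which is equivalent to the projection approach.

```matlab
% Synthetic, illustrative data only: a decaying curve with a long flat tail
y = 4 + 4 * exp(-(0:199) / 5);

lengths = [50 100 200];            % how many points of the curve we keep
elbows  = zeros(size(lengths));    % selected index for each truncation

for k = 1:numel(lengths)
    n = lengths(k);
    c = y(1:n);
    x = 1:n;

    % chord from the first point (1, c(1)) to the last point (n, c(n))
    lx = n - 1;
    ly = c(n) - c(1);

    % perpendicular distance of every point to the chord:
    % |cross(chord, point - first)| / |chord|
    d = abs(lx * (c - c(1)) - ly * (x - 1)) / hypot(lx, ly);

    [~, elbows(k)] = max(d);
    fprintf('%3d points kept: elbow at index %d\n', n, elbows(k));
end
```

On a curve like this, the selected index tends to move to the right as more flat-tail points are included, because the chord becomes shallower. This is why keeping the number of tested parameter values fixed across experiments matters.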

curve = [8.4663 8.3457 5.4507 5.3275 4.8305 4.7895 4.6889 4.6833 4.6819 4.6542 4.6501 4.6287 4.6162 4.585 4.5535 4.5134 4.474 4.4089 4.3797 4.3494 4.3268 4.3218 4.3206 4.3206 4.3203 4.2975 4.2864 4.2821 4.2544 4.2288 4.2281 4.2265 4.2226 4.2206 4.2146 4.2144 4.2114 4.1923 4.19 4.1894 4.1785 4.178 4.1694 4.1694 4.1694 4.1556 4.1498 4.1498 4.1357 4.1222 4.1222 4.1217 4.1192 4.1178 4.1139 4.1135 4.1125 4.1035 4.1025 4.1023 4.0971 4.0969 4.0915 4.0915 4.0914 4.0836 4.0804 4.0803 4.0722 4.065 4.065 4.0649 4.0644 4.0637 4.0616 4.0616 4.061 4.0572 4.0563 4.056 4.0545 4.0545 4.0522 4.0519 4.0514 4.0484 4.0467 4.0463 4.0422 4.0392 4.0388 4.0385 4.0385 4.0383 4.038 4.0379 4.0375 4.0364 4.0353 4.0344];

%# get coordinates of all the points
nPoints = length(curve);
allCoord = [1:nPoints;curve]';

%# pull out first point
firstPoint = allCoord(1,:);

%# get vector between first and last point - this is the line
lineVec = allCoord(end,:) - firstPoint;

%# normalize the line vector
lineVecN = lineVec / sqrt(sum(lineVec.^2));

%# find the distance from each point to the line:
%# vector between all points and first point
vecFromFirst = bsxfun(@minus, allCoord, firstPoint);

%# To calculate the distance to the line, we split vecFromFirst into two 
%# components, one that is parallel to the line and one that is perpendicular 
%# Then, we take the norm of the part that is perpendicular to the line and 
%# get the distance.
%# We find the vector parallel to the line by projecting vecFromFirst onto 
%# the line. The perpendicular vector is vecFromFirst - vecFromFirstParallel
%# We project vecFromFirst by taking the scalar product of the vector with 
%# the unit vector that points in the direction of the line (this gives us 
%# the length of the projection of vecFromFirst onto the line). If we 
%# multiply the scalar product by the unit vector, we have vecFromFirstParallel
scalarProduct = dot(vecFromFirst, repmat(lineVecN,nPoints,1), 2);
vecFromFirstParallel = scalarProduct * lineVecN;
vecToLine = vecFromFirst - vecFromFirstParallel;

%# distance to line is the norm of vecToLine
distToLine = sqrt(sum(vecToLine.^2,2));

%# plot the distance to the line
figure('Name','distance from curve to line'), plot(distToLine)

%# now all you need is to find the maximum
[maxDist,idxOfBestPoint] = max(distToLine);

%# plot
figure, plot(curve)
hold on
plot(allCoord(idxOfBestPoint,1), allCoord(idxOfBestPoint,2), 'or')
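Since the question asks for a way to do this without intervention across many experiments, the steps above could be wrapped in a function. The sketch below is one way to do that. The name findElbow and its signature are hypothetical and not part of the original answer. It assumes the curve has at least two points and that the first and last points differ.

```matlab
function [idx, dist] = findElbow(curve)
%FINDELBOW Index of the point farthest from the first-to-last chord.
%   Hypothetical helper (not part of the original answer); it repeats the
%   steps of the script so they can be applied to many curves in a loop.

curve = curve(:);                        % accept row or column input
nPoints = numel(curve);
allCoord = [(1:nPoints)', curve];        % one (x, y) row per point

firstPoint = allCoord(1,:);
lineVec = allCoord(end,:) - firstPoint;  % chord from first to last point
lineVecN = lineVec / norm(lineVec);      % unit vector along the chord

% vector from the first point to every point on the curve
vecFromFirst = bsxfun(@minus, allCoord, firstPoint);

% length of the projection onto the chord, then the perpendicular remainder
scalarProduct = vecFromFirst * lineVecN';
vecToLine = vecFromFirst - scalarProduct * lineVecN;

dist = sqrt(sum(vecToLine.^2, 2));       % distance of each point to the chord
[~, idx] = max(dist);                    % first maximum if there are ties
end
```

With the function saved as findElbow.m, `idx = findElbow(curve)` returns the index of the elbow for one curve. For a cell array of curves (here hypothetically named `curves`), `cellfun(@findElbow, curves)` returns one index per experiment.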

Source: the original question for "algorithm - Finding the best trade-off point on a curve" is on Stack Overflow: https://stackoverflow.com/questions/2018178/
