python 数据自动重采样

标签 python python-3.x numpy

假设我有以下数据(测量值):

enter image description here

正如您所看到的,有很多尖点(即斜率变化很大的地方)。因此,最好围绕这些点进行更多测量。为此,我编写了一个脚本:

  1. 我计算 3 个连续点的曲率: 门格尔曲率:https://en.wikipedia.org/wiki/Menger_curvature#Definition

  2. 然后我根据曲率决定应该重新采样哪些值。

...我迭代直到平均曲率下降...但它不起作用,因为它上升了。你知道为什么吗?

这是完整的代码(在 x 值的长度达到 60 后停止):

import numpy as np
import matplotlib.pyplot as plt

def curvature(A,B,C):
    """Calculates the Menger curvature fro three Points, given as numpy arrays.
    Sources:
    Menger curvature: https://en.wikipedia.org/wiki/Menger_curvature#Definition
    Area of a triangle given 3 points: https://math.stackexchange.com/questions/516219/finding-out-the-area-of-a-triangle-if-the-coordinates-of-the-three-vertices-are
    """

    # Pre-check: Making sure that the input points are all numpy arrays
    if any(x is not np.ndarray for x in [type(A),type(B),type(C)]):
        print("The input points need to be a numpy array, currently it is a ", type(A))

    # Augment Columns
    A_aug = np.append(A,1)
    B_aug = np.append(B,1)
    C_aug = np.append(C,1)

    # Caclulate Area of Triangle
    matrix = np.column_stack((A_aug,B_aug,C_aug))
    area = 1/2*np.linalg.det(matrix)

    # Special case: Two or more points are equal 
    if np.all(A == B) or  np.all(B == C):
        curvature = 0
    else:
        curvature = 4*area/(np.linalg.norm(A-B)*np.linalg.norm(B-C)*np.linalg.norm(C-A))

    # Return Menger curvature
    return curvature


def values_to_calulate(x,curvature_list, max_curvature):
    """Calculates the new x values which need to be calculated
    Middle point between the three points that were used to calculate the curvature """
    i = 0
    new_x = np.empty(0)
    for curvature in curvature_list:
        if curvature > max_curvature:
            new_x = np.append(new_x, x[i]+(x[i+2]-x[i])/3 )
        i = i+1
    return new_x


def plot(x,y, title, xLabel, yLabel):
    """Just to visualize"""

    # Plot
    plt.scatter(x,y)
    plt.plot(x, y, '-o')

    # Give a title for the sine wave plot
    plt.title(title)

    # Give x axis label for the sine wave plot
    plt.xlabel(xLabel)

    # Give y axis label for the sine wave plot
    plt.ylabel(yLabel)
    plt.grid(True, which='both')
    plt.axhline(y=0, color='k')


    # Display the sine wave
    plt.show
    plt.pause(0.05)

### STARTS HERE


# Get x values of the sine wave
x = np.arange(0, 10, 1);

# Amplitude of the sine wave is sine of a variable like time
def function(x):
    return 1+np.sin(x)*np.cos(x)**2
y = function(x)

# Plot it
plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')


continue_Loop = True

while continue_Loop == True :
    curvature_list = np.empty(0)
    for i in range(len(x)-2):
        # Get the three points
        A = np.array([x[i],y[i]])
        B = np.array([x[i+1],y[i+1]])
        C = np.array([x[i+2],y[i+2]])

        # Calculate the curvature
        curvature_value = abs(curvature(A,B,C))
        curvature_list = np.append(curvature_list, curvature_value)



    print("len: ", len(x) )
    print("average curvature: ", np.average(curvature_list))

    # Calculate the points that need to be added 
    x_new = values_to_calulate(x,curvature_list, max_curvature=0.3)

    # Add those values to the current x list:
    x = np.sort(np.append(x, x_new))

    # STOPED IT AFTER len(x) == 60
    if len(x) >= 60:
        continue_Loop = False

    # Amplitude of the sine wave is sine of a variable like time
    y = function(x)

    # Plot it
    plot(x,y, title='Data', xLabel='Time', yLabel='Amplitude')

它应该是这样的:

enter image description here

编辑:

如果你让它跑得更远......:

enter image description here

最佳答案

总结一下我上面的评论:

  • 您正在计算曲线的平均曲率,它没有理由变为 0。在每个点,无论您的点有多接近,圆半径都会收敛到该点的曲率,不是 0。

  • 另一种方法是使用两点之间的绝对导数变化:继续采样,直到 abs(d(df/dx)) < some_threshold哪里d(df/dx) = (df/dx)[n] - (df/dx)[n-1]

关于python 数据自动重采样,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/55526575/

相关文章:

python - 从三个 numpy 数组生成图像数据

jquery - 自定义 Django 标签和 jQuery

python - lxml 中的 POST 方法表单使用 submit_form 引发 TypeError

python - "query"字典的 Pythonic 方式

python - 如何用新数据更新 SVM 模型

Python:ValueError:使用序列设置数组元素

python - 如何将 Python 与不同的 SOCKS 5 代理一起使用?

python - csv 中第一位数字的数字频率,不导入

Python-pygame 如何在 Sprite 之间进行光环碰撞?

python - SQLalchemy - 迭代所有映射表