python - Splitting one-to-two x-y data into top and bottom sets

Tags: python numpy geometry cluster-analysis computational-geometry

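I have a data set in which two y values are associated with each x value. How can I split the data into "upper" and "lower" values?

Below is an example with such a data set, along with a plot of the desired "top" and "bottom" grouping (red is top, purple is bottom). My best idea so far is an iterative approach that finds a line dividing the top and bottom data. That solution is complicated and does not work very well, so I have not included it.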

import matplotlib.pyplot as plt
import numpy as np

# construct data using piecewise functions
x1 = np.linspace(0, 0.7, 70)
x2 = np.linspace(0.7, 1, 30)
x3 = np.linspace(0.01, 0.999, 100)
y1 = 4.164 * x1 ** 3
y2 = 1 / x2
y3 = x3 ** 4 - 0.1

# concatenate data
x = np.concatenate([x1, x2, x3])
y = np.concatenate([y1, y2, y3])

# I want to be able to divide the data into top and bottom,
#  as shown in the chart. The black is the unlabeled data
#  and the red and purple show the top and bottom
plt.scatter(x, y, marker='^', s=10, c='k')
plt.scatter(x1, y1, marker='x', s=0.8, c='r')
plt.scatter(x2, y2, marker='x', s=0.8, c='r')
plt.scatter(x3, y3, marker='x', s=0.8, c='purple')
plt.show()


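Best answer

You can create a dividing line by reordering the data. Sort all points by x, then apply a Gaussian filter to the sorted y values. The two data sets lie strictly above or below the Gaussian-filtered result: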

import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter1d
import numpy as np

# construct data using piecewise functions
x1 = np.linspace(0, 0.7, 70)
x2 = np.linspace(0.7, 1, 30)
x3 = np.linspace(0.01, 0.999, 100)
y1 = 4.164 * x1 ** 3
y2 = 1 / x2
y3 = x3 ** 4 - 0.1

# concatenate data
x = np.concatenate([x1, x2, x3])
y = np.concatenate([y1, y2, y3])

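# I want to be able to divide the data into top and bottom,
#  as shown in the chart. The black is the unlabeled data
#  and the red and purple show the top and bottom

# sort the points by x, then smooth the sorted y values;
# the smoothed curve (plotted in orange) runs between the two branches
idx = np.argsort(x)
newy = y[idx]
newx = x[idx]
gf = gaussian_filter1d(newy, 5)  # sigma = 5 samples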
plt.scatter(x, y, marker='^', s=10, c='k')
plt.scatter(x1, y1, marker='x', s=0.8, c='r')
plt.scatter(x2, y2, marker='x', s=0.8, c='r')
plt.scatter(x3, y3, marker='x', s=0.8, c='purple')
plt.scatter(newx, gf, c='orange')
plt.show()
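
The code above only draws the dividing curve. As a minimal sketch of the remaining step, the points can be labeled by comparing each sorted y value with the smoothed value at the same position. The helper name `split_top_bottom` and its `sigma` default are assumptions made for illustration and are not part of the original answer; the approach also assumes the two branches are sampled at a similar density along x and are clearly separated in y.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d


def split_top_bottom(x, y, sigma=5):
    """Return a boolean mask, in the input order, that is True for 'top' points.

    Hypothetical helper: sorts by x, smooths y with a Gaussian filter,
    and labels a point 'top' when it lies above the smoothed curve.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.argsort(x)                         # order the points by x
    smooth = gaussian_filter1d(y[idx], sigma)   # dividing curve at the sorted points
    is_top = np.empty(len(x), dtype=bool)
    is_top[idx] = y[idx] > smooth               # map the labels back to input order
    return is_top


# same synthetic data as in the question
x1 = np.linspace(0, 0.7, 70)
x2 = np.linspace(0.7, 1, 30)
x3 = np.linspace(0.01, 0.999, 100)
x = np.concatenate([x1, x2, x3])
y = np.concatenate([4.164 * x1 ** 3, 1 / x2, x3 ** 4 - 0.1])

is_top = split_top_bottom(x, y)
x_top, y_top = x[is_top], y[is_top]
x_bottom, y_bottom = x[~is_top], y[~is_top]
```

`sigma` is measured in samples rather than in x units, so it will likely need tuning for other data: too small and the curve follows individual points, too large and it may cut through a branch where the data bends sharply.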


Regarding "python - Splitting one-to-two x-y data into top and bottom sets", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57485437/
