python - Reduce the number of tuples in a list based on a threshold

I have a list containing several tuples.

POIs = [(3,2),(6,2),(6,5),(10,1),(12,-2),(5,7)]

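The tuples in POIs are the 2D coordinates of detected objects. I want to check the Euclidean distance between these objects, so that whenever two objects are less than 3 apart they are treated as a single object whose new coordinates are the average of the two original positions.

Conceptually this looks like a clustering problem: each object has to be compared against every other object and the distance between them computed, as follows: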

import numpy as np

# Example: only the first two points are considered here
d = np.sqrt((POIs[0][0] - POIs[1][0])**2 + (POIs[0][1] - POIs[1][1])**2)

if d < 3:
    # Replace the first point with the midpoint of the pair
    POIs[0] = ((POIs[0][0] + POIs[1][0]) / 2, (POIs[0][1] + POIs[1][1]) / 2)
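
The check above covers a single pair. As a minimal sketch of what "compare every object against every other" looks like, the distances for all pairs can be computed at once with NumPy broadcasting. The names `pts`, `dist`, and `close_pairs` are illustrative and not part of the question. This still builds a full n-by-n matrix, so it shows the quadratic cost rather than removing it.

```python
import numpy as np

POIs = [(3, 2), (6, 2), (6, 5), (10, 1), (12, -2), (5, 7)]
pts = np.asarray(POIs, dtype=float)

# Pairwise coordinate differences via broadcasting: shape (n, n, 2)
diff = pts[:, None, :] - pts[None, :, :]

# Euclidean distance between every pair of points: shape (n, n)
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Keep each unordered pair once (i < j) and apply the strict threshold d < 3
i, j = np.where(np.triu(dist < 3, k=1))
close_pairs = list(zip(i.tolist(), j.tolist()))
```

With the strict `< 3` test, only the points (6, 5) and (5, 7) qualify on this sample; pairs that are exactly 3 apart, such as (3, 2) and (6, 2), are not included.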

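So my question is whether this can be optimized computationally, because comparing every pair becomes increasingly expensive when the list contains a large number of tuples/objects.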

Best answer

Here is a solution using SciPy's fclusterdata:

import pandas as pd
from scipy.cluster.hierarchy import fclusterdata

max_dist = 3
points = [(3, 2), (6, 2), (6, 5), (10, 1), (12, -2), (5, 7)]

clusters = fclusterdata(points, t=max_dist, criterion='distance')
clustered_points = (pd.DataFrame(points, columns=['x', 'y'], index=clusters)
                      .rename_axis(index='cluster'))
cluster_centroids = (clustered_points.groupby(clusters).mean()
                                     .rename_axis(index='cluster'))

This gives the result you want:

>>> clustered_points
          x  y
cluster       
1         3  2
1         6  2
1         6  5
2        10  1
3        12 -2
1         5  7
>>> cluster_centroids
          x  y
cluster       
1         5  4
2        10  1
3        12 -2
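
For larger inputs it may help to know that fclusterdata defaults to single linkage on Euclidean distances and computes all pairwise distances internally. With a distance threshold, single-linkage clusters are the connected components of a graph that links every two points no farther apart than the threshold. The sketch below builds that graph with a KD-tree instead of a full distance matrix. It is an alternative technique, not the answer's method, and `merge_close_points` is a hypothetical helper name. Whether it is actually faster depends on how many neighbours each point has.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import cKDTree


def merge_close_points(points, max_dist):
    """Hypothetical helper: group points linked by chains of neighbours
    at most max_dist apart; return cluster labels and cluster centroids."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # All index pairs (i, j) with i < j whose distance is at most max_dist
    pairs = np.array(sorted(cKDTree(pts).query_pairs(r=max_dist)),
                     dtype=int).reshape(-1, 2)

    # Each close pair is an edge; clusters are the connected components
    graph = coo_matrix((np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])),
                       shape=(n, n))
    n_clusters, labels = connected_components(graph, directed=False)

    # Centroid of each cluster: per-label coordinate sum / member count
    sums = np.zeros((n_clusters, pts.shape[1]))
    np.add.at(sums, labels, pts)
    counts = np.bincount(labels, minlength=n_clusters)
    return labels, sums / counts[:, None]


points = [(3, 2), (6, 2), (6, 5), (10, 1), (12, -2), (5, 7)]
labels, centroids = merge_close_points(points, max_dist=3)
```

On the sample points this yields the same three groups as above, with the merged group centred at (5, 4). Two details are worth keeping in mind. Both this sketch and the 'distance' criterion treat points exactly 3 apart as close (at most 3), whereas the question asks for strictly less than 3. Also, single linkage chains neighbours together: (3, 2) and (5, 7) are about 5.4 apart, yet they end up in the same cluster because intermediate points connect them.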

For more on python - Reduce the number of tuples in a list based on a threshold, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/67267137/
