python - Given 2 lists of integers, how to find the non-overlapping ranges?

Given

x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]

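The goal is to iterate through x_i and find the value in y that is larger than x_i but not larger than x_{i+1}.

Assuming both lists are sorted and all items are unique, the desired output given x and y is: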

[(5, 8), (30, 35), (58, 60), (72, 73)]

I've tried:

def per_window(sequence, n=1):
    """
    From http://stackoverflow.com/q/42220614/610569
        >>> list(per_window([1,2,3,4], n=2))
        [(1, 2), (2, 3), (3, 4)]
        >>> list(per_window([1,2,3,4], n=3))
        [(1, 2, 3), (2, 3, 4)]
    """
    start, stop = 0, n
    seq = list(sequence)
    while stop <= len(seq):
        yield tuple(seq[start:stop])
        start += 1
        stop += 1

x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]

r = []

for xi, xiplus1 in per_window(x, 2):
    for j, yj in enumerate(y):
        if yj > xi and yj < xiplus1:
            r.append((xi, yj))
            break

# For the last x value.
for j, yj in enumerate(y):
    if yj > xiplus1:
        r.append((xiplus1, yj))
        break

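But is there a simpler way to achieve the same thing with numpy, pandas, or something else?

Best answer

You can use numpy.searchsorted with side='right' to find, for each element of x, the index of the first value in y that is larger than it, and then extract the elements with those indices. A simple version, assuming there is always a value in y larger than any element in x, could be: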

import numpy as np

x = np.array([5, 30, 58, 72])
y = np.array([8, 35, 53, 60, 66, 67, 68, 73])

np.column_stack((x, y[np.searchsorted(y, x, side='right')]))
#array([[ 5,  8],
#       [30, 35],
#       [58, 60],
#       [72, 73]])
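
If the assumption above does not hold, searchsorted returns len(y) for any x value that has no larger element in y, and indexing y with it raises an IndexError. A minimal sketch of one way to guard against that, assuming such x values should simply be dropped (the extra value 80 and the names idx, valid and pairs are illustrative, not part of the original answer):

```python
import numpy as np

x = np.array([5, 30, 58, 72, 80])   # 80 has no larger value in y
y = np.array([8, 35, 53, 60, 66, 67, 68, 73])

# Index of the first y value strictly greater than each x value.
idx = np.searchsorted(y, x, side='right')

# idx == len(y) means no y value exceeds that x value; mask those out.
valid = idx < len(y)

pairs = np.column_stack((x[valid], y[idx[valid]]))
print(pairs)
```

Here the pair for 80 is omitted, and the remaining rows match the output shown above.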

Given that y is sorted:

np.searchsorted(y, x, side='right')
# array([0, 1, 3, 7])

returns, for each value in x, the index of the first element in y that is larger than it.
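
The same lookup can be sketched without numpy using the standard library's bisect module. The version below also enforces the upper bound stated in the question (the y value must not be larger than the next x value), which the searchsorted one-liner does not check. The function name next_larger_pairs is hypothetical, and the inclusive upper bound follows the question's wording rather than the strict comparison in the asker's loop:

```python
from bisect import bisect_right

def next_larger_pairs(x, y):
    """Pair each x[i] with the first y value > x[i].

    A pair is kept only if that y value is not larger than x[i+1]
    (the last x value has no upper bound). Both lists must be sorted.
    """
    pairs = []
    for i, xi in enumerate(x):
        j = bisect_right(y, xi)      # index of the first y value > xi
        if j == len(y):
            continue                 # no y value exceeds xi
        upper = x[i + 1] if i + 1 < len(x) else None
        if upper is None or y[j] <= upper:
            pairs.append((xi, y[j]))
    return pairs

x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]
print(next_larger_pairs(x, y))
# [(5, 8), (30, 35), (58, 60), (72, 73)]
```

Each lookup is a binary search, so the loop costs O(len(x) * log(len(y))), compared with the nested loops in the question, which scan y from the start for every x value.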

Regarding "python - Given 2 lists of integers, how to find the non-overlapping ranges?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47047588/
