python - Find the indices at which any element of one list occurs in another, with duplicates

Tags: python python-3.x

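I'm new to Python, coming from MATLAB. My question is very similar to this post (Find the indices at which any element of one list occurs in another), but with a couple of twists I can't quite work in: handling duplicates and missing values.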

Following that example, I have two lists, haystack and needles:

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

In my case, however, both haystack and needles are lists of dates. I need a list that holds, for each element of needles, its index in haystack, so that:

result = [5, 6, 7, nan, 5, 9]

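The two big differences between my problem and the posted example are:

1. needles contains duplicates (haystack does not), which as far as I can tell means I can't use set().
2. In rare cases an element of needles may not be in haystack, in which case I want to insert a nan (or some other placeholder).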

This is what I have so far (not efficient enough for how large my haystack and needles are):

import numpy as np

def find_idx(a, func):
    # Indices of every element of a for which func returns True
    return [i for (i, val) in enumerate(a) if func(val)]

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

result = []
for x in needles:
    try:
        idx = find_idx(haystack, lambda y: y == x)
        result.append(idx[0])  # keep the first match only
    except IndexError:
        # idx is empty: x does not occur in haystack
        result.append(np.nan)

As far as I can tell this code does what I want, but it is not fast enough. Is there a more efficient alternative?
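For reference, the loop above scans the entire haystack once per needle, so its cost grows with len(needles) * len(haystack). A minimal sketch of the same logic using the built-in list.index, which stops at the first match, is given here; the function name find_first_indices and the missing parameter are made up for illustration. It is still a linear scan per needle, so it only trims the constant factor rather than fixing the scaling.

```python
import math

def find_first_indices(haystack, needles, missing=math.nan):
    """For each needle, return the index of its first occurrence in haystack."""
    result = []
    for needle in needles:
        try:
            # list.index raises ValueError when the value is absent
            result.append(haystack.index(needle))
        except ValueError:
            result.append(missing)
    return result

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles = ['F', 'G', 'H', 'I', 'F', 'K']

result = find_first_indices(haystack, needles)
print(result)
```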

Best answer

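If your lists are very large, it is probably worth building a dictionary that indexes the haystack, so each needle becomes a single lookup instead of a scan: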

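import numpy as np

haystack = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K']
needles  = ['F', 'G', 'H', 'I', 'F', 'K']

# Map each haystack value to its index (built once)
hayDict  = { K:i for i,K in enumerate(haystack) }
# Look up every needle; anything missing falls back to nan
result   = [ hayDict.get(N,np.nan) for N in needles]

print(result)

# [5, 6, 7, nan, 5, 9]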
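Since the question mentions that the real lists hold dates, here is a sketch of the same dictionary idea applied to datetime.date objects, which are hashable and so work as dictionary keys. The function name index_in_haystack, the missing parameter, and the sample dates are all invented for illustration. It uses setdefault so that, if the haystack ever did contain duplicates, the first index would be kept (a plain dict comprehension would keep the last).

```python
import math
from datetime import date

def index_in_haystack(haystack, needles, missing=math.nan):
    """Return the haystack index of each needle, or `missing` if absent."""
    positions = {}
    for i, value in enumerate(haystack):
        # setdefault keeps the first index seen for a given value
        positions.setdefault(value, i)
    return [positions.get(needle, missing) for needle in needles]

# Hypothetical sample data: 1 to 10 June 2019
hay_dates = [date(2019, 6, day) for day in range(1, 11)]
needle_dates = [
    date(2019, 6, 6),
    date(2019, 6, 7),
    date(2019, 6, 8),
    date(2019, 7, 1),   # not in hay_dates
    date(2019, 6, 6),   # duplicate needle
    date(2019, 6, 10),
]

date_result = index_in_haystack(hay_dates, needle_dates)
print(date_result)
```

Building the dictionary takes one pass over the haystack, and each needle is then an average constant-time lookup, so duplicates in needles cost nothing extra.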

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/56589451/
