python - Find the row numbers in a column that match any value in another DataFrame's column

I have this code:

import pandas as pd
import numpy as np

arm_1_and_m1_df = pd.DataFrame({ 'record_id': [1, 4, 3, np.nan],
                   'two': [1, 2, np.nan , 4]
                 })

redcap_final_arm1_data = pd.DataFrame({ 'record_id': [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
                   'two': [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan]
                 })

ahk_ids_new=[]
for items in arm_1_and_m1_df['record_id'].iteritems():     # https://www.geeksforgeeks.org/python-pandas-series-iteritems/
    ahk_ids_new.append(np.where(redcap_final_arm1_data['record_id'] == items))    # https://stackoverflow.com/questions/48519062/rs-which-and-which-min-equivalent-in-python

After running the code above, ahk_ids_new contains:

[(array([], dtype=int64),),
 (array([], dtype=int64),),
 (array([], dtype=int64),),
 (array([], dtype=int64),)]

The values in redcap_final_arm1_data['record_id'] are unique.

Question: I want ahk_ids_new to hold all the row numbers (index values) of redcap_final_arm1_data['record_id'] where the value equals any value in arm_1_and_m1_df['record_id']. How can I do this?

Expected output (contents) of ahk_ids_new:

Out[57]: [0, 3, 2, 9]

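If there is a better way to handle the DataFrames in my code, please post your better variant rather than fixing my code.

Best answer

Try using isin and slicing on the index: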

a_index = (redcap_final_arm1_data.index[redcap_final_arm1_data.record_id
                                           .isin(arm_1_and_m1_df.record_id)].tolist())

Output:

Out[1355]: [0, 2, 3, 9]
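
Note that isin returns the matches in the row order of redcap_final_arm1_data, so the result is [0, 2, 3, 9] rather than the [0, 3, 2, 9] requested in the question. The original loop most likely came back empty because Series.iteritems() yields (index, value) pairs rather than bare values (it was also removed in pandas 2.0 in favor of .items()). If the order of arm_1_and_m1_df matters, one possible approach is a left merge. The following is a minimal sketch, not part of the original answer; the names lookup, matched_rows and the column name 'row' are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

arm_1_and_m1_df = pd.DataFrame({'record_id': [1, 4, 3, np.nan],
                                'two': [1, 2, np.nan, 4]})
redcap_final_arm1_data = pd.DataFrame({
    'record_id': [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
    'two': [1, 2, 3, 4, 5, 6, 7, 8, 9, np.nan],
})

# Expose each row label of the lookup frame as an ordinary column.
# 'lookup' and 'row' are hypothetical names used only in this sketch.
lookup = (redcap_final_arm1_data[['record_id']]
          .rename_axis('row')
          .reset_index())

# A left merge keeps the row order of the left frame, and pandas merges
# pair NaN keys with each other. Ids with no match get NaN in 'row',
# which dropna() removes before converting back to integers.
matched_rows = (arm_1_and_m1_df[['record_id']]
                .merge(lookup, on='record_id', how='left')['row']
                .dropna()
                .astype(int)
                .tolist())

print(matched_rows)  # [0, 3, 2, 9]
```

This relies on record_id being unique in redcap_final_arm1_data, as stated in the question; with duplicate ids the merge would return one row number per duplicate.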

This question, python - Find the row numbers in a column that match any value in another DataFrame's column, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/60512539/
