python pandas: Check if a dataframe's column value is in another dataframe's column, then count and list it

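I'm learning Python here, and any help with this is much appreciated. I have a two-part question. I've already built a solution for the first part, but there must be a more Pythonic way to get there. For the second part, I'm not sure how to proceed.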

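I have columns of unique IDs in two separate dataframes. I want to count how many of the uids in the second dataframe's uid column (df1 below) also appear in the first dataframe's uid column (df below), and add each uid that is in both to a list. The code sample below works for me, but I worry something is off somewhere, and there must be a better way.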

import pandas as pd

data = {'uid': ['uid1', 'uid2', 'uid3', 'uid4'], 'value': [1, 2, 3, 4]}
df = pd.DataFrame(data)

data1 = {'uid': ['uid4', 'uid2', 'uid5'], 'value1': ["", 5, 6]}
df1 = pd.DataFrame(data1)

# Count the uids in df1 that also appear in df, and collect them in a list
count_val_in_both_df = 0
list_val_in_both_df = []
for x in range(len(df1.iloc[:, 0])):
    if df1.iloc[x, 0] in df.iloc[:, 0].values:
        count_val_in_both_df += 1
        list_val_in_both_df.append(df1.iloc[x, 0])
print('count = ' + str(count_val_in_both_df))
print(list_val_in_both_df)

Which outputs:

df
    uid  value
0  uid1      1
1  uid2      2
2  uid3      3
3  uid4      4


df1
    uid value1
0  uid4       
1  uid2      5
2  uid5      6


count = 2
['uid4', 'uid2']

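The second part is to create a column in df for the values from df1 and fill it with df1's values. I'm pretty lost on this part, but I'd like a result like this: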

    uid  value value1
0  uid1      1       
1  uid2      2      5
2  uid3      3       
3  uid4      4       

Best answer

You can use merge:

df.merge(df1, on = 'uid', how = 'left').fillna('')

    uid value   value1
0   uid1    1   
1   uid2    2   5
2   uid3    3   
3   uid4    4
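
To run the merge end to end, here is a minimal sketch built on the question's sample frames. The variable name `merged` is my own and is not part of the original answer. A left join keeps every row of df, so uid5 (present only in df1) is dropped, and uids with no match in df1 come back as NaN before `fillna('')` blanks them out.

```python
import pandas as pd

df = pd.DataFrame({'uid': ['uid1', 'uid2', 'uid3', 'uid4'],
                   'value': [1, 2, 3, 4]})
df1 = pd.DataFrame({'uid': ['uid4', 'uid2', 'uid5'],
                    'value1': ["", 5, 6]})

# Left join: keep every row of df, pull value1 from df1 where the uid matches.
merged = df.merge(df1, on='uid', how='left')

# Rows of df with no matching uid in df1 hold NaN in value1; show them as blanks.
merged = merged.fillna('')

print(merged)
```

Note that uid4 ends up blank because df1 already stores an empty string for it, not because the match failed.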

For the first part of the question, you can use a set intersection:

list_val_in_both_df = list(set(df.uid).intersection(set(df1.uid)))

This gives you the following (sets are unordered, so the elements may come back in a different order):

['uid2', 'uid4']
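
If the count and the original row order of df1 matter, one alternative (my suggestion, not part of the answer above) is a boolean mask built with `Series.isin`. This sketch reuses the question's variable names; `in_both` is a name I introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({'uid': ['uid1', 'uid2', 'uid3', 'uid4'],
                   'value': [1, 2, 3, 4]})
df1 = pd.DataFrame({'uid': ['uid4', 'uid2', 'uid5'],
                    'value1': ["", 5, 6]})

# Boolean mask: True for each row of df1 whose uid also appears in df.
in_both = df1['uid'].isin(df['uid'])

# Summing a boolean mask counts the True entries.
count_val_in_both_df = int(in_both.sum())

# Select the matching uids, keeping df1's row order.
list_val_in_both_df = df1.loc[in_both, 'uid'].tolist()

print('count =', count_val_in_both_df)  # count = 2
print(list_val_in_both_df)              # ['uid4', 'uid2']
```

Unlike the set intersection, this keeps duplicates in df1 if there are any, which matches what the loop in the question does.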

Regarding "python pandas: Check if a dataframe's column value is in another dataframe's column, then count and list it", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45970258/
