Given df1:
  Country   fruit  low   high
0   Spain  orange  100  20000
1   Italy   apple  500  50000
2     Aus   grape  300  10000
and df2:
      City   fruit    low   high
0  sample1  orange     50    200
1  sample1   apple     10    400
2  sample2  orange  25000  50000
3  sample3  orange     50    300
4  sample3   grape    350   1000
5  sample3   grape     10    100
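I want to match rows on "fruit" and pull in the corresponding row from df1 whenever the range between "low" and "high" in df2 falls within (overlaps) the range between "low" and "high" in df1. So the expected output would be: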
      City   fruit  low  high Country   fruit  low   high
0  sample1  orange   50   200   Spain  orange  100  20000
1  sample3  orange   50   300   Spain  orange  100  20000
2  sample3   grape  350  1000     Aus   grape  300  10000
I imagine it could start like this:
for sample, subdf in df2.groupby("fruit"):
    for index, row in subdf.iterrows():
Best answer
Use DataFrame.merge with an outer join, then filter the result with boolean indexing:
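# Pair each df2 row with the df1 row for the same fruit; df1's low/high become low1/high1
df1 = df2.merge(df1, on='fruit', how='outer', suffixes=('','1'))
# Keep only the rows where the two ranges overlap
df2 = df1[(df1.low1 <= df1.high) & (df1.high1 >= df1.low)]
print(df2)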
      City   fruit  low  high Country  low1  high1
0  sample1  orange   50   200   Spain   100  20000
2  sample3  orange   50   300   Spain   100  20000
4  sample3   grape  350  1000     Aus   300  10000
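For anyone who wants to run this end to end, here is a self-contained sketch of the same idea. The frames are rebuilt from the tables in the question. The names merged, overlaps, contained, result and strict_result are my own and not part of the original answer. I also swapped the outer join for an inner join, on the assumption that rows without a matching fruit are not wanted (they would fail the comparison anyway because of NaN). The last part shows a stricter test, in case "falls within" is meant literally rather than as an overlap.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Country": ["Spain", "Italy", "Aus"],
    "fruit": ["orange", "apple", "grape"],
    "low": [100, 500, 300],
    "high": [20000, 50000, 10000],
})
df2 = pd.DataFrame({
    "City": ["sample1", "sample1", "sample2", "sample3", "sample3", "sample3"],
    "fruit": ["orange", "apple", "orange", "orange", "grape", "grape"],
    "low": [50, 10, 25000, 50, 350, 10],
    "high": [200, 400, 50000, 300, 1000, 100],
})

# Pair every df2 row with the df1 row for the same fruit.
# Colliding df1 columns get the suffix "1" (low1, high1).
# Assumption: an inner join is enough, since unmatched fruits are not needed.
merged = df2.merge(df1, on="fruit", how="inner", suffixes=("", "1"))

# Two closed ranges overlap when each one starts no later than the other ends.
overlaps = (merged["low1"] <= merged["high"]) & (merged["high1"] >= merged["low"])
result = merged[overlaps].reset_index(drop=True)

# Stricter variant (my assumption about an alternative reading of the question):
# the df2 range must lie entirely inside the df1 range.
contained = (merged["low"] >= merged["low1"]) & (merged["high"] <= merged["high1"])
strict_result = merged[contained].reset_index(drop=True)

print(result)
print(strict_result)
```

On this data the overlap test keeps the three rows from the expected output, while the strict test keeps only the sample3 grape row with the range 350 to 1000, so which mask to use depends on what "within" should mean.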
Regarding "Python - compare ranges between two dataframes", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59387488/