I have the following DataFrame (df):
loc pop_1 source_1 pop_2 source_2
a 99 group_a 77 group_b
b 93 group_a 90 group_b
c 58 group_a 59 group_b
d 47 group_a 62 group_b
I created an additional column, "upper_limit":
df['upper_limit'] = df[['pop_1','pop_2']].max(axis=1)
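I now want to add another column that looks at the values in "upper_limit", compares them to pop_1 and pop_2, and then selects the text from source_1 or source_2, whichever one matches. That is: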
loc pop_1 source_1 pop_2 source_2 upper_limit source
a 99 group_a 77 group_b 99 group_a
b 93 group_a 90 group_b 93 group_a
c 58 group_a 59 group_b 59 group_b
d 47 group_a 62 group_b 62 group_b
I tried to create a dictionary from pop_1 and source_1 as follows:
table_dict = df[['pop_1','source_1']]
z = table_dict.to_dict
and then map it with:
df['source'] = 'n/a'
df['source'].replace(z,inplace=True)
This returns the DataFrame, but the "source" column shows only n/a values.
Best answer
I now want to add another column that looks at the values in 'upper_limit', compares them to pop_1 and pop_2 and then selects the text from source_1 or source_2 when they match.
You can do this more simply with np.where:
In [19]: import numpy as np
In [20]: df['upper_limit source'] = np.where(df.upper_limit == df.pop_1, df.source_1, df.source_2)
In [21]: df
Out[21]:
loc pop_1 pop_2 source_1 source_2 upper_limit upper_limit source
0 a 99 77 group_a group_b 99 group_a
1 b 93 90 group_a group_b 93 group_a
2 c 58 59 group_a group_b 59 group_b
3 d 47 62 group_a group_b 62 group_b
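For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same np.where approach. It rebuilds the question's data by hand and writes the result to a column named "source" (the name used in the question's desired output, not the "upper_limit source" name used above); bracket indexing is used instead of attribute access purely as a stylistic choice.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question (values copied from the post).
df = pd.DataFrame({
    "loc": ["a", "b", "c", "d"],
    "pop_1": [99, 93, 58, 47],
    "source_1": ["group_a"] * 4,
    "pop_2": [77, 90, 59, 62],
    "source_2": ["group_b"] * 4,
})

# Row-wise maximum of the two population columns.
df["upper_limit"] = df[["pop_1", "pop_2"]].max(axis=1)

# Where upper_limit equals pop_1, take source_1; otherwise take source_2.
df["source"] = np.where(
    df["upper_limit"] == df["pop_1"], df["source_1"], df["source_2"]
)

print(df)
```

Note that if pop_1 and pop_2 are equal in some row, the condition is true and source_1 is picked for that row.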
For "python - find and replace using multiple criteria in pandas", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39274824/