I have two DataFrames. One contains the names of people born each year (1880-2017) and how often each name occurred:
name gender frequency year
Mary F 7065 1880
Anna F 2604 1880
Emma F 2003 1880
Elizabeth F 1939 1880
Minnie F 1746 1880
...
The other contains the year and the total number of births (1880-2017):
birth_year Male Female Total
1880 118400 97605 216005
1881 108282 98855 207137
1882 122031 115695 237726
1883 112477 120059 232536
1884 122738 137586 260324
...
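The two DataFrames differ in size, but wherever the birth year matches I want to append the columns of the second DataFrame to the first, so that I can include the percentage of the population. I tried something like this: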
for i in range(len(all_names_nat_DF)):
    for j in range(len(total_births)):
        if all_names_nat_DF['year'][i] == total_births['birth_year']:
            all_names_nat_DF.append(total_births['birth_year'][j])
But this gives me the error ValueError: Can only compare identically-labeled Series objects
Best answer
You want to use df.merge:
df
name gender frequency year
0 Mary F 7065 1880
1 Anna F 2604 1880
2 Emma F 2003 1880
3 Eliz F 1939 1880
4 Minnie F 1746 1880
births
birth_year Male Female Total
0 1880 118400 97605 216005
1 1881 108282 98855 207137
2 1882 122031 115695 237726
3 1883 112477 120059 232536
4 1884 122738 137586 260324
df.merge(births, how='inner', left_on='year', right_on='birth_year')
name gender frequency year birth_year Male Female Total
0 Mary F 7065 1880 1880 118400 97605 216005
1 Anna F 2604 1880 1880 118400 97605 216005
2 Emma F 2003 1880 1880 118400 97605 216005
3 Eliz F 1939 1880 1880 118400 97605 216005
4 Minnie F 1746 1880 1880 118400 97605 216005
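Since the stated goal is a percentage of the population, here is a minimal sketch of how the merged frame could be used for that. It rebuilds the sample rows shown above; the column name pct_of_total, the choice of how='left' (to keep every name row even when a year has no totals), and the validate check are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "name": ["Mary", "Anna", "Emma", "Elizabeth", "Minnie"],
    "gender": ["F", "F", "F", "F", "F"],
    "frequency": [7065, 2604, 2003, 1939, 1746],
    "year": [1880, 1880, 1880, 1880, 1880],
})
births = pd.DataFrame({
    "birth_year": [1880, 1881, 1882, 1883, 1884],
    "Male": [118400, 108282, 122031, 112477, 122738],
    "Female": [97605, 98855, 115695, 120059, 137586],
    "Total": [216005, 207137, 237726, 232536, 260324],
})

# Left merge keeps every name row; validate raises if a year appears
# more than once in births (assumed: one totals row per year).
merged = df.merge(
    births,
    how="left",
    left_on="year",
    right_on="birth_year",
    validate="many_to_one",
)

# birth_year duplicates year after the merge, so drop it.
merged = merged.drop(columns="birth_year")

# Share of all births in that year, in percent (hypothetical column name).
merged["pct_of_total"] = merged["frequency"] / merged["Total"] * 100

print(merged[["name", "year", "frequency", "Total", "pct_of_total"]])
```

If the percentage should be relative to births of the same gender instead, the denominator could be chosen from the Male or Female column based on gender rather than taken from Total.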
Source: "python - How to conditionally append a pandas Series to another DataFrame", from a similar question on Stack Overflow: https://stackoverflow.com/questions/55284314/