yearCount = df[['antibiotic', 'order_date', 'antiYearCount']]
yearGroups = yearCount.groupby('order_date')
for year in yearGroups:
    yearCount['antiYearCount'] = year.groupby('antibiotic')['antibiotic'].transform(pd.Series.value_counts)
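Here, yearCount is a DataFrame with the columns 'order_date', 'antibiotic', and 'antiYearCount'. I have already cleaned 'order_date' so that it contains only the order year. I want to group yearCount by the year in 'order_date', count how many times each 'antibiotic' occurs within each year group, and then assign that value to the 'antiYearCount' column of yearCount.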
Best answer
I think you need to add the column order_date to the groupby. You can then also use size instead of pd.Series.value_counts for the same output:
import pandas as pd

df = pd.DataFrame({'antibiotic': list('accbbb'),
                   'antiYearCount': [4, 5, 4, 5, 5, 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 7, 1, 0],
                   'E': [5, 3, 6, 9, 2, 4],
                   'order_date': pd.to_datetime(['2012-01-01'] * 3 + ['2012-01-02'] * 3)})
print(df)
   C  D  E  antiYearCount antibiotic order_date
0  7  1  5              4          a 2012-01-01
1  8  3  3              5          c 2012-01-01
2  9  5  6              4          c 2012-01-01
3  4  7  9              5          b 2012-01-02
4  2  1  2              5          b 2012-01-02
5  3  0  4              4          b 2012-01-02
# copy the slice to avoid SettingWithCopyWarning
# https://stackoverflow.com/a/45035966/2901002
yearCount = df[['antibiotic', 'order_date', 'antiYearCount']].copy()
# count the rows in each (order_date, antibiotic) group and broadcast the count back to every row
yearCount['antiYearCount'] = (yearCount.groupby(['order_date', 'antibiotic'])['antibiotic']
                                       .transform('size'))
print(yearCount)
  antibiotic order_date  antiYearCount
0          a 2012-01-01              1
1          c 2012-01-01              2
2          c 2012-01-01              2
3          b 2012-01-02              3
4          b 2012-01-02              3
5          b 2012-01-02              3
yearCount['antiYearCount'] = (yearCount.groupby(['order_date', 'antibiotic'])['antibiotic']
                                       .transform(pd.Series.value_counts))
print(yearCount)
  antibiotic order_date  antiYearCount
0          a 2012-01-01              1
1          c 2012-01-01              2
2          c 2012-01-01              2
3          b 2012-01-02              3
4          b 2012-01-02              3
5          b 2012-01-02              3
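If you do want to loop over the groups explicitly, as the question attempts, something along these lines should work. This is only a minimal sketch: the sample frame mirrors the example above, the name year_count is made up for illustration, and the loop is expected to be slower than transform('size') on large data. The key points are that iterating a GroupBy yields (key, sub-DataFrame) pairs, and that the result has to be written back through .loc on the original frame using the group's index.

```python
import pandas as pd

# Hypothetical sample data mirroring the example above
df = pd.DataFrame({
    'antibiotic': list('accbbb'),
    'order_date': pd.to_datetime(['2012-01-01'] * 3 + ['2012-01-02'] * 3),
})

year_count = df.copy()
year_count['antiYearCount'] = 0

# Iterating a GroupBy yields (group key, sub-DataFrame) pairs
for order_date, group in year_count.groupby('order_date'):
    # Count how often each antibiotic occurs within this group
    counts = group['antibiotic'].value_counts()
    # Map each row's antibiotic to its count and write it back by index label
    year_count.loc[group.index, 'antiYearCount'] = group['antibiotic'].map(counts)

# Vectorised version from the answer, kept for comparison
expected = (year_count.groupby(['order_date', 'antibiotic'])['antibiotic']
                      .transform('size'))

print(year_count)
```

Both approaches should give the same counts per row; the loop is mainly useful when the per-group operation is too irregular to express with transform.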
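Regarding "python - How do you iterate over groups in a pandas DataFrame, operate on each group, and assign values back to the original DataFrame?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45364063/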