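I would like some help summarizing the DataFrame detailed below into a single summary row, as shown in the desired output further down the page. Many thanks.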
import pandas as pd

employees = {'Name of Employee': ['Mark','Mark','Mark','Mark','Mark','Mark', 'Mark','Mark','Mark','Mark','Mark','Mark','Mark'],
'Department': ['21','21','21','21','21','21', '21','21','21','21','21','21','21'],
'Team': ['2','2','2','2','2','2','2','2','2','2','2','2','2'],
'Log': ['2020-02-19 09:01:17', '2020-02-19 09:54:02', '2020-04-10 11:00:31', '2020-04-11 12:39:08', '2020-04-18 09:45:22', '2020-05-05 09:01:17', '2020-05-23 09:54:02', '2020-07-03 11:00:31', '2020-07-03 12:39:08', '2020-07-04 09:45:22', '2020-07-05 09:01:17', '2020-07-06 09:54:02', '2020-07-06 11:00:31'],
'Call Duration' : ['0.01178', '0.01736','0.01923','0.00911','0.01007','0.01206','0.01256','0.01006','0.01162','0.00733','0.01250','0.01013','0.01308'],
'ITT': ['NO','YES', 'NO', 'Follow up', 'YES','YES', 'NO', 'Follow up','YES','YES', 'NO','YES','YES']
}
df = pd.DataFrame(employees)
Desired output:
Name  Dept  Team  Start       End         Weeks  Total Calls  Ave. Call time  Sold  Rejected  More info
Mark  21    2     2020-02-19  2020-07-06  19.71  13           0.01207         7     4         2
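The logic I am looking to apply is as follows (I suspect the syntax I have written below contains errors, but I hope you can still follow the calculations):
- Start = earliest date in df['Log']
- End = latest date in df['Log']
- Weeks = (latest date in df['Log'] - earliest date in df['Log']) / 7
- Total Calls = df['Log'].count
- Ave. Call time = (df['Call Duration'].sum) / (df['Log'].count)
- Sold = (df['ITT'] == 'YES').count
- Rejected = (df['ITT'] == 'NO').count
- More info = (df['ITT'] == 'Follow up').count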
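One detail in the last three items is worth a quick illustration: calling `.count()` on a boolean mask counts every non-null row, whether it is True or False, whereas `.sum()` counts only the True entries. The sketch below uses a small made-up Series (`itt`, not the full column from the question) to show the difference.

```python
import pandas as pd

# Small stand-in for the ITT column (illustrative values only)
itt = pd.Series(['NO', 'YES', 'NO', 'Follow up', 'YES'])

mask = itt == 'YES'      # boolean Series: True where the value is 'YES'

n_rows = mask.count()    # counts non-null entries, True and False alike
n_sold = mask.sum()      # True counts as 1, so this is the number of matches

print(n_rows, n_sold)    # 5 2
```

So for counting matches, `(df['ITT'] == 'YES').sum()` is the form that gives the intended number.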
Best answer
Try using named aggregation (pd.NamedAgg) together with groupby:
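# Parse the text columns so that min/max/mean operate on real dates and numbers
df['Log'] = pd.to_datetime(df['Log'])
df['Call Duration'] = df['Call Duration'].astype(float)

# Named aggregation: each keyword is an output column, given as (input column, function)
summary = (
    df.groupby(['Name of Employee', 'Team', 'Department'])
      .agg(Start=('Log', 'min'),
           End=('Log', 'max'),
           # span between the first and last call, expressed in weeks
           Weeks=('Log', lambda x: (x.max() - x.min()) / pd.Timedelta(weeks=1)),
           Total_Calls=('Log', 'count'),
           Avg_Call_Time=('Call Duration', 'mean'),
           # summing a boolean mask counts the matching rows
           Sold=('ITT', lambda x: (x == 'YES').sum()),
           Rejected=('ITT', lambda x: (x == 'NO').sum()),
           More_info=('ITT', lambda x: (x == 'Follow up').sum()))
)
print(summary)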
Output:
                                                Start                 End      Weeks  Total_Calls  Avg_Call_Time  Sold  Rejected  More_info
Name of Employee Team Department
Mark             2    21          2020-02-19 09:01:17 2020-07-06 11:00:31  19.726114           13       0.012068     7         4          2
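The result above differs slightly from the desired output in the question: the keys sit in the index, Start and End carry a time of day, and Weeks is 19.726 rather than 19.71. The 19.71 in the question matches a whole-day difference (138 days / 7). The following is a minimal sketch of one way to get closer to that layout, assuming date-only Start/End and two-decimal rounding are what is wanted. The name `summary`, the renamed columns and the rounding precision are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame({
    'Name of Employee': ['Mark'] * 13,
    'Department': ['21'] * 13,
    'Team': ['2'] * 13,
    'Log': ['2020-02-19 09:01:17', '2020-02-19 09:54:02', '2020-04-10 11:00:31',
            '2020-04-11 12:39:08', '2020-04-18 09:45:22', '2020-05-05 09:01:17',
            '2020-05-23 09:54:02', '2020-07-03 11:00:31', '2020-07-03 12:39:08',
            '2020-07-04 09:45:22', '2020-07-05 09:01:17', '2020-07-06 09:54:02',
            '2020-07-06 11:00:31'],
    'Call Duration': ['0.01178', '0.01736', '0.01923', '0.00911', '0.01007',
                      '0.01206', '0.01256', '0.01006', '0.01162', '0.00733',
                      '0.01250', '0.01013', '0.01308'],
    'ITT': ['NO', 'YES', 'NO', 'Follow up', 'YES', 'YES', 'NO',
            'Follow up', 'YES', 'YES', 'NO', 'YES', 'YES'],
})
df['Log'] = pd.to_datetime(df['Log'])
df['Call Duration'] = df['Call Duration'].astype(float)

summary = (
    df.groupby(['Name of Employee', 'Department', 'Team'])
      .agg(Start=('Log', 'min'),
           End=('Log', 'max'),
           Total_Calls=('Log', 'count'),
           Avg_Call_Time=('Call Duration', 'mean'),
           Sold=('ITT', lambda x: (x == 'YES').sum()),
           Rejected=('ITT', lambda x: (x == 'NO').sum()),
           More_info=('ITT', lambda x: (x == 'Follow up').sum()))
      .reset_index()  # turn the group keys back into ordinary columns
      .rename(columns={'Name of Employee': 'Name', 'Department': 'Dept'})
)

# Drop the time of day, then measure the span in whole days
summary['Start'] = summary['Start'].dt.normalize()
summary['End'] = summary['End'].dt.normalize()
weeks = ((summary['End'] - summary['Start']).dt.days / 7).round(2)
summary.insert(5, 'Weeks', weeks)  # place Weeks right after End

summary['Avg_Call_Time'] = summary['Avg_Call_Time'].round(5)

print(summary.to_string(index=False))
```

With the question's data this gives a single row with Weeks of 19.71, 13 calls, an average call time of 0.01207, and 7 / 4 / 2 for Sold / Rejected / More_info.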
Regarding python - summarizing a pandas DataFrame into a single row, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64251277/