I'm new to Pandas and trying to generate a spreadsheet for a parking lot. I like Pandas, but it's slow going, and I'm trying to generate some new columns of sums...
import pandas as pd
data = pd.DataFrame({"Car": ["Hyundai", "Hyundai", "Honda", "Honda"], "Type": ["Accent", "Accent", "Civic", "Civic"], "Trans": ["Auto", "Manual", "Auto", "Manual"], "TOTAL": [2, 4, 5, 3]})
print(data)
print(data.groupby(['Car', 'Type', 'Trans'])['TOTAL'].sum())
I get the entirely predictable...
       Car  TOTAL   Trans    Type
0  Hyundai      2    Auto  Accent
1  Hyundai      4  Manual  Accent
2    Honda      5    Auto   Civic
3    Honda      3  Manual   Civic
Car      Type    Trans
Honda    Civic   Auto      5
                 Manual    3
Hyundai  Accent  Auto      2
                 Manual    4
Ideally, what I'd like to achieve is...
Car      Type    Auto  Manual  Total
Honda    Civic      5       3      8
Hyundai  Accent     2       4      6
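I don't know much about Pandas yet, but I'm guessing this calls for an "apply" or an agg() function. So far, though, I keep running into syntax errors. I'd appreciate any pointers in the right direction. ..JW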
Best answer
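To do this with built-in pandas methods, you can set your 'Car', 'Type', and 'Trans' columns as the index, call unstack() to get the TOTAL for each subgroup, and then sum across the columns: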
import pandas as pd
data = pd.DataFrame({"Car": ["Hyundai", "Hyundai", "Honda", "Honda"], "Type": ["Accent", "Accent", "Civic", "Civic"], "Trans": ["Auto", "Manual", "Auto", "Manual"], "TOTAL": [2, 4, 5, 3]}).set_index(['Car', 'Type', 'Trans'])
# unstack() moves 'Trans' into the columns; selecting 'TOTAL' gets rid of the column MultiIndex it creates
total_by_trans = data.unstack().loc[:, 'TOTAL']
total_by_trans['Total'] = total_by_trans.sum(axis=1)  # row-wise sum of Auto and Manual
total_by_trans.columns.name = None  # just cleaning up
print(total_by_trans)
                Auto  Manual  Total
Car     Type
Honda   Civic      5       3      8
Hyundai Accent     2       4      6
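As a possible alternative, the same table can be sketched with pivot_table(), which is not part of the answer above. One reason to consider it: unstack() raises an error when the index contains duplicate ('Car', 'Type', 'Trans') combinations, whereas pivot_table() aggregates them. The variable name wide and the choice of fill_value=0 are assumptions made for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "Car": ["Hyundai", "Hyundai", "Honda", "Honda"],
    "Type": ["Accent", "Accent", "Civic", "Civic"],
    "Trans": ["Auto", "Manual", "Auto", "Manual"],
    "TOTAL": [2, 4, 5, 3],
})

# One row per (Car, Type), one column per transmission, summing TOTAL.
# fill_value=0 is an assumption: it fills combinations that never occur.
wide = data.pivot_table(
    index=["Car", "Type"],
    columns="Trans",
    values="TOTAL",
    aggfunc="sum",
    fill_value=0,
)

# Add the row-wise total across the transmission columns.
wide["Total"] = wide.sum(axis=1)
wide.columns.name = None  # drop the leftover 'Trans' label on the columns

print(wide)
```

If 'Car' and 'Type' are wanted as ordinary columns rather than as the index, calling wide.reset_index() afterwards should do it.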
Source: the Stack Overflow question "python - conditional sums for a pandas DataFrame": https://stackoverflow.com/questions/34498112/