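I would appreciate any help with this: is there a better way to compute each row's share of its parent (group) total in Pandas than the approach below?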
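import pandas as pd

raw_data = {'product': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
            'revenue': [10, 20, 20, 0, 50, 50, 0, 0, 30]}
df = pd.DataFrame(raw_data, columns=['product', 'revenue'])
unique_values = df['product'].unique()
pieces = []  # one sub-frame per product
for value in unique_values:
    # .copy() avoids SettingWithCopyWarning when adding the new column
    small_df = df[df['product'] == value].copy()
    small_df['shares'] = small_df['revenue'] / small_df['revenue'].sum()
    pieces.append(small_df)
# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
L = pd.concat(pieces, ignore_index=True)
print(L)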
Best answer
Try this:
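# group_keys=False keeps the original row index, so the result aligns with df
# (on pandas 2.x the default group_keys=True would add the group key to the index)
df['shares'] = df.groupby('product', group_keys=False)['revenue'].apply(lambda x: x / x.sum())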
In [898]: df
Out[898]:
  product  revenue  shares
0       A       10     0.2
1       A       20     0.4
2       A       20     0.4
3       B        0     0.0
4       B       50     0.5
5       B       50     0.5
6       C        0     0.0
7       C        0     0.0
8       C       30     1.0
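As an alternative sketch (not part of the original answer), the same shares can be computed without a lambda by using `transform('sum')`, which broadcasts each group's total back to the rows of that group. The variable name `group_total` is only illustrative; the data is the sample from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'product': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'revenue': [10, 20, 20, 0, 50, 50, 0, 0, 30],
})

# transform('sum') returns a Series with the same index as df,
# where every row holds the total revenue of its product group
group_total = df.groupby('product')['revenue'].transform('sum')

# Element-wise division gives each row's share of its group total
df['shares'] = df['revenue'] / group_total
print(df)
```

Because the result of `transform` is already aligned with the original index, no `group_keys` handling is needed. Note that a group whose revenue sums to zero would produce NaN shares with either approach.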
Regarding python - computing shares of the parent total in a Pandas DataFrame, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52909883/