I am trying to aggregate a pd.DataFrame with several custom functions, in particular functions from scipy.stats. I can get it to work with a single function, in this case trim_mean:
import pandas as pd
import numpy as np
from scipy.stats import trim_mean
df = pd.DataFrame(np.random.randn(100, 3), columns=['A', 'B', 'C'], index=pd.date_range('1/1/2000', periods=100))
# this works as expected
df.agg([np.sum, np.mean])
# now with a different function, works also
df.agg(lambda x: trim_mean(x, 0.2))
# apply also works
df.apply(lambda x: trim_mean(x, 0.2))
However, df.agg([lambda x: trim_mean(x, 0.2)]) raises IndexError: tuple index out of range, and so does df.apply([lambda x: trim_mean(x, 0.2)]).
I found an old issue on pandas-dev, but it did not make sense to me.
Can anyone help?
Best answer
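Instead of passing a list of functions, pass a single lambda that calls each function and returns the results as a Series. The overall result is then a DataFrame: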
c = ['trim_mean','mean','sum']
print (df.agg(lambda x: pd.Series([trim_mean(x, 0.2), np.mean(x), np.sum(x)], index=c)))
Or:
print (df.apply(lambda x: pd.Series([trim_mean(x, 0.2), np.mean(x), np.sum(x)], index=c)))
                   A         B         C
trim_mean  -0.143219 -0.018430 -0.097768
mean       -0.171887 -0.042308 -0.004843
sum       -17.188738 -4.230797 -0.484343
Verification:
print (df.agg([np.sum, np.mean]))
              A         B         C
sum  -17.188738 -4.230797 -0.484343
mean  -0.171887 -0.042308 -0.004843
print(df.agg(lambda x: trim_mean(x, 0.2)))
A -0.143219
B -0.018430
C -0.097768
dtype: float64
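If the set of functions changes often, the same idea can be wrapped in a small helper. This is only a sketch: agg_columns is a hypothetical name, not part of pandas, and it assumes every function reduces a whole column (a Series) to a single scalar. It uses a seeded random generator so the numbers are reproducible, so they will differ from the output shown above.

```python
import numpy as np
import pandas as pd
from scipy.stats import trim_mean


def agg_columns(frame, funcs):
    """Apply every function in `funcs` to each column of `frame`.

    `funcs` maps a row label to a callable that reduces a whole Series
    to a scalar. `agg_columns` is a hypothetical helper, not a pandas API.
    """
    names = list(funcs)
    # Returning a Series from the lambda makes apply() build a DataFrame
    # with one row per function and one column per original column.
    return frame.apply(
        lambda col: pd.Series([funcs[name](col) for name in names], index=names)
    )


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 3)), columns=["A", "B", "C"])

result = agg_columns(
    df,
    {
        "trim_mean": lambda x: trim_mean(x, 0.2),
        "mean": np.mean,
        "sum": np.sum,
    },
)
print(result)
```

Because each function receives the full column, functions such as trim_mean that cannot handle a single scalar value still work here.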
Regarding python - How to pass a list of custom functions to pandas.DataFrame.aggregate, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48767067/