Based on the output DataFrame from this link:
import pandas as pd
import numpy as np

np.random.seed(2021)
dates = pd.date_range('20130226', periods=90)
df = pd.DataFrame(np.random.uniform(0, 10, size=(90, 6)), index=dates, columns=['A_values', 'B_values', 'C_values', 'D_values', 'E_values', 'target'])

# all your models
models = df.columns[df.columns.str.endswith('_values')]

# function to calculate mape
def mape(y_true, y_pred):
    y_pred = np.array(y_pred)
    return np.mean(np.abs(y_true - y_pred) / np.clip(np.abs(y_true), 1, np.inf),
                   axis=0)*100

errors = (df.groupby(pd.Grouper(freq='M'))
            .apply(lambda x: mape(x[models], x[['target']]))
         )
res = pd.merge_asof(df[['target']], errors,
                    left_index=True,
                    right_index=True,
                    direction='forward'
                    )
print(res)
Output:
              target    A_values    B_values    C_values    D_values   E_values
2013-02-26  1.281624   48.759348   77.023855  325.376455   74.422508  60.602101
2013-02-27  0.585713   48.759348   77.023855  325.376455   74.422508  60.602101
2013-02-28  9.638430   48.759348   77.023855  325.376455   74.422508  60.602101
2013-03-01  1.950960   98.909249  143.760594   90.051465  138.059241  93.461361
2013-03-02  0.690563   98.909249  143.760594   90.051465  138.059241  93.461361
...              ...         ...         ...         ...         ...        ...
2013-05-22  5.554824  122.272490  139.420056  133.658101   62.368310  94.334362
2013-05-23  8.440801  122.272490  139.420056  133.658101   62.368310  94.334362
2013-05-24  0.968086  122.272490  139.420056  133.658101   62.368310  94.334362
2013-05-25  0.672555  122.272490  139.420056  133.658101   62.368310  94.334362
2013-05-26  5.273122  122.272490  139.420056  133.658101   62.368310  94.334362
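How can I group by year-month and find the columns with the N smallest values?
For example, if I set N=3, the expected result would be:
Thanks in advance for your help.
Best answer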
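Here is one approach using argsort:
errors = (df.groupby(pd.Grouper(freq='M'))
            .apply(lambda x: mape(x[models], x[['target']]))
         )
k = 2 # your k here
# filter top k models: argsort returns the column order, not each column's rank,
# so apply argsort twice to get 0-based ranks (0 = smallest error)
ranks = np.argsort(np.argsort(errors.to_numpy(), axis=1), axis=1)
sorted_args = pd.DataFrame(ranks < k, index=errors.index, columns=errors.columns)
res = pd.merge_asof(df[['target']], sorted_args,
                    left_index=True,
                    right_index=True,
                    direction='forward'
                    )
topk = df[models].where(res[models])
topk as printed in the original answer is shown below. That printout was produced with the mask np.argsort(errors, axis=1) < k, which compares sort indices rather than ranks, so it does not always keep the k smallest-error models. With a rank-based mask, the kept columns differ for some months: judging by the monthly errors printed in the question, February would keep A_values and E_values, and May would keep D_values and E_values.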
            A_values  B_values  C_values  D_values  E_values
2013-02-26  6.059783       NaN       NaN  3.126731       NaN
2013-02-27  1.789931       NaN       NaN  7.843101       NaN
2013-02-28  9.623960       NaN       NaN  5.612724       NaN
2013-03-01       NaN       NaN  4.521452       NaN  5.693051
2013-03-02       NaN       NaN  5.178144       NaN  7.322250
...              ...       ...       ...       ...       ...
2013-05-22       NaN       NaN  0.427136       NaN  6.803052
2013-05-23       NaN       NaN  2.225667       NaN  2.756443
2013-05-24       NaN       NaN  7.212742       NaN  0.430184
2013-05-25       NaN       NaN  5.384490       NaN  5.461017
2013-05-26       NaN       NaN  9.823048       NaN  6.312104
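If the goal is the names of the N best columns per month rather than a masked frame, the following is a minimal sketch of one way to get them. It is not part of the original answer. The small errors table is rebuilt by hand from the rounded monthly values printed in the question (April is elided there, so it is left out), and the names n, order, top_n_names and top_n_mask are hypothetical.

```python
import numpy as np
import pandas as pd

# Monthly errors, rounded from the output printed in the question
# (assumption: only the three months visible there are included).
errors = pd.DataFrame(
    {
        "A_values": [48.76, 98.91, 122.27],
        "B_values": [77.02, 143.76, 139.42],
        "C_values": [325.38, 90.05, 133.66],
        "D_values": [74.42, 138.06, 62.37],
        "E_values": [60.60, 93.46, 94.33],
    },
    index=pd.PeriodIndex(["2013-02", "2013-03", "2013-05"], freq="M"),
)

n = 2  # number of smallest-error columns to keep per month

# argsort gives, for each row, the column positions in ascending error order,
# so the first n positions of each row are the n smallest-error columns.
order = np.argsort(errors.to_numpy(), axis=1)
top_n_names = pd.DataFrame(
    errors.columns.to_numpy()[order[:, :n]],
    index=errors.index,
    columns=[f"rank_{i + 1}" for i in range(n)],
)

# rank gives each cell its position within its row (1 = smallest error),
# which yields a boolean mask directly.
top_n_mask = errors.rank(axis=1, method="first") <= n

print(top_n_names)
print(top_n_mask)
```

The sketch shows the difference between the two tools: argsort answers "which column comes first, second, ...", while rank answers "what position does this column hold", and only the latter can be compared with n cell by cell.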
Regarding "python - Group by year-month and find the top N smallest-value columns in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/69937232/