下面的代码将在 ONE 数据帧中生成所需的输出,但是,我想在 FOR 循环中动态创建数据帧,然后将移位值分配给该数据帧。例如,数据框 df_lag_12 将只包含 column1_t12 和 column2_12。任何想法将不胜感激。我尝试使用 EXEC 语句动态创建 12 个数据帧,谷歌搜索似乎表明这是不好的做法。
import pandas as pd
list1=list(range(0,20))
list2=list(range(19,-1,-1))
d={'column1':list(range(0,20)),
'column2':list(range(19,-1,-1))}
df=pd.DataFrame(d)
df_lags=pd.DataFrame()
for col in df.columns:
for i in range(12,0,-1):
df_lags[col+'_t'+str(i)]=df[col].shift(i)
df_lags[col]=df[col].values
print(df_lags)
for df in (range(12,0,-1)):
exec('model_data_lag_'+str(df)+'=pd.DataFrame()')
动态创建的数据帧 DF_LAGS_12 的期望输出:
var_list=['column1_t12','column2_t12']
df_lags_12=df_lags[var_list]
print(df_lags_12)
最佳答案
我认为最好的方法是创建 DataFrames 字典
:
d = {}
for i in range(12,0,-1):
d['t' + str(i)] = df.shift(i).add_suffix('_t' + str(i))
如果需要先指定列:
d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
d['t' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))
dict理解
解决方案:
d = {'t' + str(i): df.shift(i).add_suffix('_t' + str(i)) for i in range(12,0,-1)}
print (d['t10'])
column1_t10 column2_t10
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 0.0 19.0
11 1.0 18.0
12 2.0 17.0
13 3.0 16.0
14 4.0 15.0
15 5.0 14.0
16 6.0 13.0
17 7.0 12.0
18 8.0 11.0
19 9.0 10.0
编辑:全局变量是否可能,但更好的是 dictionary
:
d = {}
cols = ['column1','column2']
for i in range(12,0,-1):
globals()['df' + str(i)] = df[cols].shift(i).add_suffix('_t' + str(i))
print (df10)
column1_t10 column2_t10
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 NaN NaN
6 NaN NaN
7 NaN NaN
8 NaN NaN
9 NaN NaN
10 0.0 19.0
11 1.0 18.0
12 2.0 17.0
13 3.0 16.0
14 4.0 15.0
15 5.0 14.0
16 6.0 13.0
17 7.0 12.0
18 8.0 11.0
19 9.0 10.0
关于Python Pandas 动态创建数据框,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47109931/