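I am having trouble populating a Pandas dataframe with values from lists of unequal length.
nx_lists_into_df is a list of numpy arrays.
I get the following error:
ValueError: Length of values does not match length of index
The code is as follows: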
from itertools import zip_longest

import pandas as pd
from numpy import array

# Column headers
df_cols = ["f1", "f2"]
# Create one dataframe for each sheet
df1 = pd.DataFrame(columns=df_cols)
df2 = pd.DataFrame(columns=df_cols)
# Create list of dataframes to iterate through
df_list = [df1, df2]
# Lists to be put into the dataframes
nx_lists_into_df = [[array([0, 1, 3, 4, 7]),
                     array([2, 5, 6, 8])],
                    [array([0, 1, 2, 6, 7]),
                     array([3, 4, 5, 8])]]
# Loop through each sheet (i.e. each round of k folds)
for df, test_index_list in zip_longest(df_list, nx_lists_into_df):
    counter = -1
    # Loop through each column in that sheet (i.e. each fold)
    for col in df_cols:
        print(col)
        counter += 1
        # Add 1 to each index value to start indexing at 1
        # (raises the ValueError on "f2", whose array is shorter than "f1")
        df[col] = test_index_list[counter] + 1
Thanks for your help.
Edit: the result should look like this:
print(df1)
   f1   f2
0   0    2
1   1    5
2   3    6
3   4    8
4   7  NaN
print(df2)
   f1   f2
0   0    3
1   1    4
2   2    5
3   6    8
4   7  NaN
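Best answer
We can use pd.Series, which attaches an appropriate index to each array, so the pd.DataFrame constructor accepts the columns without complaining about unequal lengths.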
df1, df2 = (
    pd.DataFrame(dict(zip(df_cols, map(pd.Series, d))))
    for d in nx_lists_into_df
)
print(df1)
   f1   f2
0   0  2.0
1   1  5.0
2   3  6.0
3   4  8.0
4   7  NaN
print(df2)
   f1   f2
0   0  3.0
1   1  4.0
2   2  5.0
3   6  8.0
4   7  NaN
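To see why wrapping each array in pd.Series makes the difference, here is a minimal sketch. The names a, b, raised and aligned are only for illustration and are not part of the original answer. Raw arrays of different lengths are rejected by the constructor, while Series are aligned on their indexes and the missing position is filled with NaN (which is also why f2 comes out as float).

```python
import numpy as np
import pandas as pd

a = np.array([0, 1, 3, 4, 7])  # 5 values
b = np.array([2, 5, 6, 8])     # 4 values

# Raw arrays of different lengths are rejected by the constructor.
try:
    pd.DataFrame({"f1": a, "f2": b})
    raised = False
except ValueError:
    raised = True

# Wrapped in pd.Series, each column carries its own 0..n-1 index.
# The constructor aligns the columns on the union of those indexes
# and fills the position missing from the shorter one with NaN.
aligned = pd.DataFrame({"f1": pd.Series(a), "f2": pd.Series(b)})
print(raised)
print(aligned)
```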
Setup
import pandas as pd
from numpy import array
nx_lists_into_df = [[array([0, 1, 3, 4, 7]),
                     array([2, 5, 6, 8])],
                    [array([0, 1, 2, 6, 7]),
                     array([3, 4, 5, 8])]]
# Column headers
df_cols = ["f1", "f2"]
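The question's loop also added 1 to every value so that numbering starts at 1, which the answer's output does not do. If that offset is still wanted, one possible adaptation is sketched below. The dict comprehension and the optional cast to the nullable "Int64" dtype are assumptions added here, not part of the original answer.

```python
import pandas as pd
from numpy import array

nx_lists_into_df = [[array([0, 1, 3, 4, 7]),
                     array([2, 5, 6, 8])],
                    [array([0, 1, 2, 6, 7]),
                     array([3, 4, 5, 8])]]
df_cols = ["f1", "f2"]

# Add 1 to each array before wrapping it in a Series, so the
# values are 1-based as in the question's loop.
df1, df2 = (
    pd.DataFrame({col: pd.Series(arr + 1) for col, arr in zip(df_cols, fold)})
    for fold in nx_lists_into_df
)

# Optional: the nullable "Int64" dtype keeps the values as integers
# and shows the missing entry as <NA> instead of turning f2 into floats.
df1_int = df1.astype("Int64")
print(df1)
print(df1_int)
```

Whether the nullable dtype is worth using depends on what later consumes the frames; plain float columns with NaN are the more widely supported default.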
Regarding python - Populating Pandas columns with lists of unequal length, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/49140589/