python - pandas DataFrame: fill missing rows with NaN

Tags: python pandas dataframe

Given the DataFrame df1:

import numpy as np
import pandas as pd

df1=pd.DataFrame(data=[[1,2,3],[2,4,5],[3,6,7],[1,2,3],[1,4,5],[2,6,7]],columns=['day','d','c'],index=[32,32,32,44,55,55])
print(df1)
    day  d  c
32    1  2  3
32    2  4  5
32    3  6  7
44    1  2  3
55    1  4  5
55    2  6  7

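I want to add as many rows as needed so that, for each index value, the day column runs from 1 to 5, starting from day 1. The other columns of the added rows should be filled with NaN. As the expected output shows, the existing rows keep their order and move to the last days of each index value: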

df2=pd.DataFrame(data=[[1,np.nan,np.nan],[2,np.nan,np.nan],[3,2,3],[4,4,5],[5,6,7],
                       [1,np.nan,np.nan],[2,np.nan,np.nan],[3,np.nan,np.nan],[4,np.nan,np.nan],[5,2,3],
                       [1,np.nan,np.nan],[2,np.nan,np.nan],[3,np.nan,np.nan],[4,4,5],[5,6,7]],
                       columns=['day','d','c'],index=[32,32,32,32,32,44,44,44,44,44,55,55,55,55,55])
print(df2)
    day    d    c
32    1  NaN  NaN
32    2  NaN  NaN
32    3  2.0  3.0
32    4  4.0  5.0
32    5  6.0  7.0
44    1  NaN  NaN
44    2  NaN  NaN
44    3  NaN  NaN
44    4  NaN  NaN
44    5  2.0  3.0
55    1  NaN  NaN
55    2  NaN  NaN
55    3  NaN  NaN
55    4  4.0  5.0
55    5  6.0  7.0

Best answer

Use:

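N = 5  # every index value should end up with days 1..N

def f(x):
    # cast to float so the columns can hold NaN
    x = x.astype(float)
    # label the existing rows with the last len(x) days, e.g. 3, 4, 5 for three rows
    x.index = range(N + 1 - len(x), N + 1)
    # reindex to days 1..N; days without a row become NaN
    return x.reindex(range(1, N + 1))

# select the columns with a list; the tuple form ['d','c'] is deprecated in newer pandas
df1 = df1.groupby(level=0)[['d', 'c']].apply(f).rename_axis([None, 'day']).reset_index(level=1)
print(df1)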
    day    d    c
32    1  NaN  NaN
32    2  NaN  NaN
32    3  2.0  3.0
32    4  4.0  5.0
32    5  6.0  7.0
44    1  NaN  NaN
44    2  NaN  NaN
44    3  NaN  NaN
44    4  NaN  NaN
44    5  2.0  3.0
55    1  NaN  NaN
55    2  NaN  NaN
55    3  NaN  NaN
55    4  4.0  5.0
55    5  6.0  7.0
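
To make it easier to see what f does to each group, here is a minimal sketch that runs the same three steps by hand on the single group with index 32. The variable names group, step and padded are only illustrative and are not part of the answer.

```python
import pandas as pd

N = 5  # target number of days per index value, as in the answer

# One group from the question: index 32 has three rows (columns d and c only).
group = pd.DataFrame({'d': [2, 4, 6], 'c': [3, 5, 7]}, index=[32, 32, 32])

# Step 1: cast to float so the columns can hold NaN.
step = group.astype(float)

# Step 2: relabel the rows with the last len(group) day numbers.
# For 3 rows and N = 5 this is range(3, 6), i.e. days 3, 4, 5.
step.index = range(N + 1 - len(step), N + 1)

# Step 3: reindex to the full range 1..N; days without a row become NaN.
padded = step.reindex(range(1, N + 1))
print(padded)
```

Note that this only pads groups with at most N rows. With more than N rows, the labels assigned in step 2 start below 1, so the reindex in step 3 would drop the earliest rows.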

Regarding python - pandas DataFrame: fill missing rows with NaN, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48579238/
