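I can't find an efficient way to add a single row to a MultiIndexed DataFrame. When I add a row, the MultiIndex is flattened into a plain index of tuples. Strangely, this is not a problem for MultiIndexed columns.
System information: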
Python 3.6.1 |Continuum Analytics, Inc.| (default, Mar 22 2017, 19:25:17)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'0.19.2'
Example data: a DataFrame with MultiIndex rows and columns
import numpy as np
import pandas as pd

# Note: the `labels` argument was renamed to `codes` in pandas 0.24;
# on newer versions, pass codes=... instead.
index = pd.MultiIndex(levels=[['bar', 'foo'], ['one', 'two']],
                      labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
                      names=['row_0', 'row_1'])
columns = pd.MultiIndex(levels=[['dull', 'shiny'], ['a', 'b']],
                        labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
                        names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), columns=columns, index=index)
print(df)
col_0        dull       shiny
col_1           a     b     a     b
row_0 row_1
bar   one     1.0   1.0   1.0   1.0
      two     1.0   1.0   1.0   1.0
foo   one     1.0   1.0   1.0   1.0
      two     1.0   1.0   1.0   1.0
Adding an extra column to the DataFrame is no problem:
df['last_col'] = 42 #define a new column and assign a value
print(df)
col_0        dull       shiny        last_col
col_1           a     b     a     b
row_0 row_1
bar   one     1.0   1.0   1.0   1.0        42
      two     1.0   1.0   1.0   1.0        42
foo   one     1.0   1.0   1.0   1.0        42
      two     1.0   1.0   1.0   1.0        42
However, if I do the same to add a row (using loc), the MultiIndex is flattened into a plain index of tuples:
df.loc['last_row'] = 43 #define a new row and assign a value
print(df)
col_0        dull       shiny        last_col
col_1           a     b     a     b
(bar, one)    1.0   1.0   1.0   1.0        42
(bar, two)    1.0   1.0   1.0   1.0        42
(foo, one)    1.0   1.0   1.0   1.0        42
(foo, two)    1.0   1.0   1.0   1.0        42
last_row     43.0  43.0  43.0  43.0        43
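Does anyone know a simple and efficient way to add a row without flattening the index? Many thanks!
Best answer
I think you need to give the MultiIndex key as a tuple with one value per level (two values here):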
df.loc[('last_row', 'a'), :] = 43
print(df)
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row a       43.0  43.0  43.0  43.0
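For readers on a recent pandas release, here is a minimal sketch of the same tuple-key assignment. It assumes a pandas version that provides MultiIndex.from_product and builds the frame that way instead of with the levels/labels constructor used in the question; the two print calls are only there to illustrate that the row index keeps both of its levels.

```python
import numpy as np
import pandas as pd

# Same frame as in the question, built with from_product (an assumption:
# this replaces the levels/labels constructor, which newer pandas rejects).
index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# Give one value per index level, as a tuple, and select all columns.
df.loc[('last_row', 'a'), :] = 43

# Inspect the index: it should still be a two-level MultiIndex.
print(type(df.index).__name__)
print(df.index.nlevels)
```

The key point is that the row key has as many entries as the index has levels, so pandas does not have to guess how to fit a single label into a two-level index.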
For columns it works similarly:
df[('last_col', 'a')] = 43
print(df)
col_0        dull       shiny        last_col
col_1           a     b     a     b         a
row_0 row_1
bar   one     1.0   1.0   1.0   1.0        43
      two     1.0   1.0   1.0   1.0        43
foo   one     1.0   1.0   1.0   1.0        43
      two     1.0   1.0   1.0   1.0        43
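To see why the column case in the question did not flatten anything, it can help to compare the column labels produced by a bare label and by a tuple. This is a small sketch, assuming a recent pandas release; the names base, df_plain and df_tuple are made up for the illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
base = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# Bare label: pandas pads the missing second level with an empty string.
df_plain = base.copy()
df_plain['last_col'] = 42

# Tuple: both levels of the new column are given explicitly.
df_tuple = base.copy()
df_tuple[('last_col', 'a')] = 43

# Show the label of the newly added column in each case.
print(df_plain.columns[-1])
print(df_tuple.columns[-1])
```

In both cases the columns remain a MultiIndex; the only difference is whether the second level of the new column is an empty string or the value supplied in the tuple.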
EDIT:
It seems you also need to specify the columns; to select all of them, use a bare colon (:):
df.loc['last_row',:] = 43
print(df)
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row         43.0  43.0  43.0  43.0
If a level is not specified, an empty string is added for it:
print(df.index)
MultiIndex(levels=[['bar', 'foo', 'last_row'], ['one', 'two', '']],
           labels=[[0, 0, 1, 1, 2], [0, 1, 0, 1, 2]],
           names=['row_0', 'row_1'])
df.loc['last_row','dull'] = 43
print(df)
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row         43.0  43.0   NaN   NaN
df.loc['last_row', ('dull', 'a')] = 43
print(df)
col_0            dull       shiny
col_1               a     b     a     b
row_0    row_1
bar      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
foo      one      1.0   1.0   1.0   1.0
         two      1.0   1.0   1.0   1.0
last_row         43.0   NaN   NaN   NaN
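On the "efficient" part of the question: when several rows have to be added, one alternative that is not part of the answer above is to build the new rows as their own DataFrame with a matching MultiIndex and combine the two with pd.concat. The sketch below assumes a recent pandas release; the names new_index, new_rows and result are made up for the illustration.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'foo'], ['one', 'two']],
                                   names=['row_0', 'row_1'])
columns = pd.MultiIndex.from_product([['dull', 'shiny'], ['a', 'b']],
                                     names=['col_0', 'col_1'])
df = pd.DataFrame(np.ones((4, 4)), index=index, columns=columns)

# Describe the new rows with a MultiIndex that has the same level names.
new_index = pd.MultiIndex.from_tuples([('last_row', 'a'), ('last_row', 'b')],
                                      names=df.index.names)

# A scalar fills every cell; the columns are reused from the original frame.
new_rows = pd.DataFrame(43.0, index=new_index, columns=df.columns)

# Concatenate once instead of enlarging the frame row by row.
result = pd.concat([df, new_rows])
print(result)
```

Because both frames carry a two-level row index, the result keeps the MultiIndex, and all new rows are appended in one operation rather than through repeated loc assignments.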
Source: a similar question on Stack Overflow about how to add a row to a pandas DataFrame without flattening the MultiIndex: https://stackoverflow.com/questions/44949953/