import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2), index=['a', 'b', 'c'],
                   columns=['one', 'two'])
df2 = pd.DataFrame(5 + np.arange(4).reshape(2, 2), index=['a', 'c'],
                   columns=['three', 'four'])
>>> df1
   one  two
a    0    1
b    2    3
c    4    5
>>> df2
   three  four
a      5     6
c      7     8
res = pd.concat([df1, df2], axis=1, levels=['level1', 'level2'],
                names=['upper', 'lower'])
>>> res
   one  two  three  four
a    0    1      5     6
b    2    3    NaN   NaN
c    4    5      7     8
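My question is: why are the levels and names not shown in the res output above? Is there a real example of how to use the levels option?
Thanks for your time and help.
Best answer
Very interesting question.
I searched on SO, but it seems this option has never been used there :(
The docs, however, have an example that comes with this note: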
Yes, this is fairly esoteric, but is actually necessary for implementing things like GroupBy where the order of a categorical variable is meaningful.
The docs also say:
levels : list of sequences, default None. Specific levels (unique values) to use for constructing a MultiIndex. Otherwise they will be inferred from the keys.
So, when keys is also passed, levels adds new levels to the MultiIndex:
res = pd.concat([df1, df2], axis=1,
                keys=['level1', 'level2'],
                levels=[['level1', 'level2', 'level3']],
                names=['upper', 'lower'])
print(res)
upper level1     level2
lower    one two  three four
a          0   1    5.0  6.0
b          2   3    NaN  NaN
c          4   5    7.0  8.0
print(res.columns)
MultiIndex(levels=[['level1', 'level2', 'level3'], ['four', 'one', 'three', 'two']],
           labels=[[0, 0, 1, 1], [1, 3, 2, 0]],
           names=['upper', 'lower'])
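The extra level never appears in the printed frame, so it can help to check it programmatically. Below is a minimal sketch, assuming a reasonably recent pandas; the variable names declared, used and trimmed are only illustrative. It compares the levels declared on the outer column level with the values actually in use, and shows that MultiIndex.remove_unused_levels() drops the unused one.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2), index=['a', 'b', 'c'],
                   columns=['one', 'two'])
df2 = pd.DataFrame(5 + np.arange(4).reshape(2, 2), index=['a', 'c'],
                   columns=['three', 'four'])

res = pd.concat([df1, df2], axis=1,
                keys=['level1', 'level2'],
                levels=[['level1', 'level2', 'level3']],
                names=['upper', 'lower'])

# Levels declared on the outer column level, including the unused 'level3'.
declared = list(res.columns.levels[0])

# Values of the outer level that actually occur in the columns.
used = list(res.columns.get_level_values('upper').unique())

# remove_unused_levels() rebuilds the index without declared-but-unused levels.
trimmed = list(res.columns.remove_unused_levels().levels[0])

print(declared)
print(used)
print(trimmed)
```

So 'level3' exists only as metadata on the index; no column is labelled with it.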
The same call without the levels parameter:
res = pd.concat([df1, df2], axis=1,
                keys=['level1', 'level2'],
                names=['upper', 'lower'])
print(res)
upper level1     level2
lower    one two  three four
a          0   1    5.0  6.0
b          2   3    NaN  NaN
c          4   5    7.0  8.0
print(res.columns)
MultiIndex(levels=[['level1', 'level2'], ['four', 'one', 'three', 'two']],
           labels=[[0, 0, 1, 1], [1, 3, 2, 0]],
           names=['upper', 'lower'])
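To connect this with the docs note about the order of a categorical variable: levels also seems to fix the order in which the level values are stored, independent of the order of keys. The following is only a sketch under that assumption, not something taken from the docs. It uses MultiIndex.codes, which assumes a recent pandas (older versions called this attribute labels, as in the printed output above), and the names level_order, codes and column_order are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2), index=['a', 'b', 'c'],
                   columns=['one', 'two'])
df2 = pd.DataFrame(5 + np.arange(4).reshape(2, 2), index=['a', 'c'],
                   columns=['three', 'four'])

# Same keys as before, but the level values are declared in reverse order.
res = pd.concat([df1, df2], axis=1,
                keys=['level1', 'level2'],
                levels=[['level2', 'level1']],
                names=['upper', 'lower'])

# Order of the declared level values on the outer column level.
level_order = list(res.columns.levels[0])

# Integer codes pointing into level_order, one per column.
codes = [int(c) for c in res.columns.codes[0]]

# The columns themselves still follow the order of the concatenated frames.
column_order = list(res.columns.get_level_values('upper'))

print(level_order)
print(codes)
print(column_order)
```

The columns stay in the order the frames were passed in; only the stored level order and the codes pointing into it change.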
Regarding python - the levels option in pandas concat, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44262676/