python - Comparing one part of a DataFrame against another using a MultiIndex

Tags: python pandas

I have a DataFrame with a 3-level MultiIndex:

>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> df = pd.DataFrame(np.random.randint(10, size=(18, 2)),
                      index=pd.MultiIndex.from_product([[True, False],
                                                        ['yes', 'no', 'maybe'],
                                                        ['one', 'two', 'three']],
                                                       names=['bool', 'ans', 'count']),
                      columns=['A', 'B'])
>>> df
                   A  B
bool  ans   count      
True  yes   one    5  0
            two    3  3
            three  7  9
      no    one    3  5
            two    2  4
            three  7  6
      maybe one    8  8
            two    1  6
            three  7  7
False yes   one    8  1
            two    5  9
            three  8  9
      no    one    4  3
            two    0  3
            three  5  0
      maybe one    2  3
            two    8  1
            three  3  3

My goal is to subtract the maybe values from all the other values that share the same bool and count. The subtrahend is

>>> sub = df.loc[(slice(None), 'maybe', slice(None)), :]
>>> sub
                   A  B
bool  ans   count      
True  maybe one    8  8
            two    1  6
            three  7  7
False maybe one    2  3
            two    8  1
            three  3  3

The problem is that when I try to subtract it from the other rows, the indexes do not line up the way I want:

>>> df - sub
                     A    B
bool  ans   count          
False maybe one    0.0  0.0
            three  0.0  0.0
            two    0.0  0.0
      no    one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
      yes   one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
True  maybe one    0.0  0.0
            three  0.0  0.0
            two    0.0  0.0
      no    one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
      yes   one    NaN  NaN
            three  NaN  NaN
            two    NaN  NaN
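
A short sketch of why the NaN rows appear, rebuilding the question's frame. Both operands carry the full three-level index, so the subtraction pairs rows on the complete (bool, ans, count) label, and only the maybe rows have a partner. The names matched and naive are introduced here for illustration, and xs with drop_level=False is used as a stand-in for the sub slice above.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
index = pd.MultiIndex.from_product(
    [[True, False], ['yes', 'no', 'maybe'], ['one', 'two', 'three']],
    names=['bool', 'ans', 'count'])
df = pd.DataFrame(np.random.randint(10, size=(18, 2)),
                  index=index, columns=['A', 'B'])

# Same rows as `sub` in the question: the 'maybe' rows, all three levels kept.
sub = df.xs('maybe', level='ans', drop_level=False)

# Which full (bool, ans, count) labels of df also exist in sub?
matched = df.index.isin(sub.index)
print(int(matched.sum()))  # 6 of the 18 labels have a partner
print(list(df.index[matched].get_level_values('ans').unique()))

# Every label without a partner becomes NaN in the naive subtraction.
naive = df - sub
print(int(naive.isna().all(axis=1).sum()))
```

So the alignment itself is working as designed. What is missing is a way to tell pandas that ans should not take part in the matching.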

The result I want is

                   A   B
bool  ans   count
True  yes   one   -3  -8
            two    2  -3
            three  0   2
True  no    one   -5  -3
            two    1  -2
            three  0  -1
True  maybe one    0   0
            two    0   0
            three  0   0
False yes   one    6  -2
            two   -3   8
            three  5   6
False no    one    2   0
            two   -8   2
            three  2  -3
False maybe one    0   0
            two    0   0
            three  0   0
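
To pin down the arithmetic behind this table, here is a deliberately naive sketch that looks up the matching maybe row for every row by its (bool, count) pair. It is only a specification of the target, not a suggested solution. The names values, maybe_rows and wanted are made up for the illustration, and the numbers are copied from the frame printed at the top of the question.

```python
import pandas as pd

# Values copied from the frame printed in the question.
values = [[5, 0], [3, 3], [7, 9], [3, 5], [2, 4], [7, 6],
          [8, 8], [1, 6], [7, 7], [8, 1], [5, 9], [8, 9],
          [4, 3], [0, 3], [5, 0], [2, 3], [8, 1], [3, 3]]
index = pd.MultiIndex.from_product(
    [[True, False], ['yes', 'no', 'maybe'], ['one', 'two', 'three']],
    names=['bool', 'ans', 'count'])
df = pd.DataFrame(values, index=index, columns=['A', 'B'])

rows = df.to_numpy()

# Map each (bool, count) pair to its 'maybe' row.
maybe_rows = {(b, c): row
              for (b, a, c), row in zip(df.index, rows)
              if a == 'maybe'}

# Subtract the matching 'maybe' row from every row, ignoring 'ans'.
wanted = pd.DataFrame(
    [row - maybe_rows[(b, c)] for (b, a, c), row in zip(df.index, rows)],
    index=df.index, columns=df.columns)
print(wanted)
```

For example, the (True, yes, one) row is [5, 0] and its partner (True, maybe, one) is [8, 8], which gives the [-3, -8] shown in the first line of the table.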

How do I tell pandas to respect the bool and count levels but ignore the ans level?

Best answer

A cross-section with xs may be a good option:

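df.sub(
    df.xs('maybe', level=1)  # level 1 is ans; xs drops it, leaving (bool, count)
).reorder_levels(df.index.names).reindex(df.index)  # restore level order, then row order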

Output:

                   A  B
bool  ans   count      
True  yes   one   -3 -8
            two    2 -3
            three  0  2
      no    one   -5 -3
            two    1 -2
            three  0 -1
      maybe one    0  0
            two    0  0
            three  0  0
False yes   one    6 -2
            two   -3  8
            three  5  6
      no    one    2  0
            two   -8  2
            three  2 -3
      maybe one    0  0
            two    0  0
            three  0  0
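
The one-liner can be unpacked into three steps, sketched below on the question's data (values copied from the printed frame; maybe, diff and result are names chosen for this walk-through). The level-restoring step is written with reorder_levels(df.index.names), which does not depend on where pandas places the ans level after the alignment. I am assuming that position is not guaranteed to be the same on every pandas version, so the intermediate level order is printed rather than stated.

```python
import pandas as pd

# Values copied from the frame printed in the question.
values = [[5, 0], [3, 3], [7, 9], [3, 5], [2, 4], [7, 6],
          [8, 8], [1, 6], [7, 7], [8, 1], [5, 9], [8, 9],
          [4, 3], [0, 3], [5, 0], [2, 3], [8, 1], [3, 3]]
index = pd.MultiIndex.from_product(
    [[True, False], ['yes', 'no', 'maybe'], ['one', 'two', 'three']],
    names=['bool', 'ans', 'count'])
df = pd.DataFrame(values, index=index, columns=['A', 'B'])

# Step 1: the 'maybe' cross-section. xs drops the 'ans' level,
# so the result is indexed by (bool, count) only.
maybe = df.xs('maybe', level='ans')
print(list(maybe.index.names))  # ['bool', 'count']

# Step 2: subtract. The two indexes share the level names 'bool' and
# 'count', so pandas aligns on those and repeats each 'maybe' row
# for every value of 'ans'.
diff = df.sub(maybe)
print(list(diff.index.names))  # level order here may vary (assumption)

# Step 3: put the levels back in the original order, then the rows.
result = diff.reorder_levels(list(df.index.names)).reindex(df.index)
print(result)
```

The final reindex(df.index) matters because the aligned result may come back with its rows sorted rather than in the original yes, no, maybe order.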

Regarding python - Comparing one part of a DataFrame against another using a MultiIndex, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67203888/
