python - Pandas unstack should not sort the remaining index

Tags: python pandas sorting

Is it possible to unstack one level of a multi-indexed DataFrame without sorting the remaining index levels of the result? Example code:

import pandas as pd

arrays = [["room1", "room1", "room1", "room1", "room1", "room1",
           "room2", "room2", "room2", "room2", "room2", "room2"],
          ["bed1", "bed1", "bed1", "bed2", "bed2", "bed2",
           "bed1", "bed1", "bed1", "bed2", "bed2", "bed2"],
          ["blankets", "pillows", "all", "blankets", "pillows", "all",
           "blankets", "pillows", "all", "blankets", "pillows", "all"]]

tuples = list(zip(*arrays))

index = pd.MultiIndex.from_tuples(tuples, names=['first index', 
                                                 'second index', 'third index'])

series = pd.Series([1, 2, 3, 1, 1, 2, 2, 2, 4, 2, 1, 3 ], index=index)

series

first index  second index  third index
room1        bed1          blankets       1
                           pillows        2
                           all            3
             bed2          blankets       1
                           pillows        1
                           all            2
room2        bed1          blankets       2
                           pillows        2
                           all            4
             bed2          blankets       2
                           pillows        1
                           all            3

Unstacking the second index level:

series.unstack(1)

second index             bed1  bed2
first index third index            
room1       all             3     2
            blankets        1     1
            pillows         2     1
room2       all             4     3
            blankets        2     2
            pillows         2     1

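The problem is that the order of the third index level has changed, because unstack automatically sorts that level alphabetically. The 'all' row, which is the sum of the 'blankets' and 'pillows' rows, now comes first instead of last. How can this be fixed? There seems to be no option that stops unstack from sorting automatically. It also seems impossible to sort a DataFrame's index by a key, with something like myDataFrame.sort_index(..., key=['some_key']).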

Best answer

One possible solution is reindex (or, on older pandas versions, reindex_axis) with the parameter level=1:

s = series.unstack(1).reindex(['blankets','pillows','all'], level=1)
print (s)
second index             bed1  bed2
first index third index            
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

# reindex_axis was deprecated in pandas 0.21 and removed in 1.0; use reindex on current versions
s = series.unstack(1).reindex_axis(['blankets','pillows','all'], level=1)
print (s)
second index             bed1  bed2
first index third index            
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

A more dynamic solution, which takes the order from the original index:

a = series.index.get_level_values('third index').unique()
print (a)
Index(['blankets', 'pillows', 'all'], dtype='object', name='third index')

s = series.unstack(1).reindex(a, level=1)
print (s)
second index             bed1  bed2
first index third index            
room1       blankets        1     1
            pillows         2     1
            all             3     2
room2       blankets        2     2
            pillows         2     1
            all             4     3

Source for "python - Pandas unstack should not sort the remaining index": a similar question on Stack Overflow: https://stackoverflow.com/questions/45434630/
