python - pivot_table produces a MultiIndex on the columns

I have the following table:

In [303]: table.head()
Out[303]: 
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

I want to create a simple pd.DataFrame where:

columns = [1, 2, ..., 52] (weekofyear)
rows = [0, 1, ..., 6] (weekday)
values = np.sum of people

I tried pd.pivot_table, which gives me the values I expect:

In [308]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values=["people"], aggfunc=[np.sum])
     ...: p
     ...: 
Out[308]: 
              sum                                             ...             \
           people                                             ...              
weekofyear     1    2    3    4    5    6    7    8   9    10 ...    43   44   
weekday                                                       ...              
0             162   86   84   95   92   98  108  102  97   87 ...   108   86   
1              95  113   88   78  108  112   98  104  87  105 ...    85   82   
2             102   70   93   82  103   80  103   85  82   96 ...    87  105   
3              87   91  101   83   91  100  100   80  89   86 ...    87   91   
4             111   91  110  103   93  116  110   99  78   77 ...    83  102   
5             117  107   99   88   97   90  100   91  97   88 ...   103  110   
6              92   95   90   86   91  103   98  100  89   96 ...    94  101   



weekofyear   45   46   47   48   49   50   51   52  
weekday                                             
0            99   92   99   83  107  106   93  107  
1           105   83  101   93  102   89  113   84  
2            96   84  110   83  104   84   84  116  
3            87   96   87   88   88   83  113   93  
4            93   81  104  108   72  101  109   97  
5            81  107   97   89   86  108  113  101  
6            93   92   93   91   89   96   93  226  

[7 rows x 52 columns]

But instead of plain weekofyear columns, I am stuck with a MultiIndex on the columns that I cannot get rid of, as shown here:

In [309]: p.columns
Out[309]: 
MultiIndex(levels=[['sum'], ['people'], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]],
           names=[None, None, 'weekofyear'])

The row index, on the other hand, looks fine:

In [311]: p.index
Out[311]: Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64', name='weekday')

I tried unstack() and reset_index(), without success.

Am I missing something?

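Best answer

Pass single values for values and aggfunc instead of lists. Each argument given as a list adds its own level to the column index, which is where the extra 'sum' and 'people' levels come from. Example -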

p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

Demo -

In [3]: table
Out[3]:
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

In [12]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

In [13]: p
Out[13]:
weekofyear   1    52
weekday
0            76  NaN
1            95  NaN
2           102  NaN
3            87  NaN
6           NaN  119

In [14]: p.columns
Out[14]: Int64Index([1, 52], dtype='int64', name='weekofyear')
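
To make the difference concrete, here is a minimal sketch that builds both versions side by side. It assumes only the five rows shown in the question, and it uses the string "sum" in place of np.sum (an assumption on my part, since recent pandas versions may warn when given the NumPy function directly). The names nested and flat are made up for illustration.

```python
import pandas as pd

# Stand-in for `table`, built from the five rows shown in the question.
table = pd.DataFrame(
    {
        "people": [119, 76, 95, 102, 87],
        "weekday": [6, 0, 1, 2, 3],
        "weekofyear": [52, 1, 1, 1, 1],
    },
    index=pd.to_datetime(
        ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04", "2012-01-05"]
    ),
)

# Lists for values and aggfunc: each list adds a level to the columns.
nested = pd.pivot_table(
    table, index=["weekday"], columns=["weekofyear"],
    values=["people"], aggfunc=["sum"],
)

# Scalars for values and aggfunc: only the weekofyear level remains.
flat = pd.pivot_table(
    table, index="weekday", columns="weekofyear",
    values="people", aggfunc="sum",
)

print(nested.columns.nlevels)  # three levels: function, value column, weekofyear
print(flat.columns.nlevels)    # a single level: weekofyear
```

The aggregated numbers are identical in both frames; only the shape of the column index differs.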

From the documentation -

aggfunc : function, default numpy.mean, or list of functions
If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)

The same happens with values, although the documentation does not mention it specifically.
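
If you already have a pivot table with the extra levels and would rather not rebuild it, dropping the outer column levels should also work. This is only a sketch and not part of the original answer: it assumes the same five-row stand-in for table, uses "sum" instead of np.sum, and relies on DataFrame.droplevel, which is available in reasonably recent pandas versions.

```python
import pandas as pd

# Stand-in for `table`, built from the five rows shown in the question.
table = pd.DataFrame(
    {
        "people": [119, 76, 95, 102, 87],
        "weekday": [6, 0, 1, 2, 3],
        "weekofyear": [52, 1, 1, 1, 1],
    },
    index=pd.to_datetime(
        ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04", "2012-01-05"]
    ),
)

# Reproduce the three-level column index from the question.
p = pd.pivot_table(
    table, index=["weekday"], columns=["weekofyear"],
    values=["people"], aggfunc=["sum"],
)

# Drop level 0 (function name) and level 1 (value column), keeping weekofyear.
flattened = p.droplevel([0, 1], axis=1)

print(flattened.columns.tolist())
```

Selecting p["sum"]["people"] gets to the same place, but droplevel does not depend on the level labels.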

Source: this question comes from Stack Overflow: https://stackoverflow.com/questions/32654495/
