python - pivot_table gives a MultiIndex on the columns

Tags: python pandas pivot-table multi-index

I have the following table:

In [303]: table.head()
Out[303]: 
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

I would like to create a simple pd.DataFrame where:

  • columns = [1, 2, ..., 52] (weekofyear)
  • rows = [0, 1, ..., 6] (weekday)
  • values = np.sum of people

I tried pd.pivot_table, which gives me the values I expect:

In [308]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values=["people"], aggfunc=[np.sum])
     ...: p
     ...: 
Out[308]: 
              sum                                             ...             \
           people                                             ...              
weekofyear     1    2    3    4    5    6    7    8   9    10 ...    43   44   
weekday                                                       ...              
0             162   86   84   95   92   98  108  102  97   87 ...   108   86   
1              95  113   88   78  108  112   98  104  87  105 ...    85   82   
2             102   70   93   82  103   80  103   85  82   96 ...    87  105   
3              87   91  101   83   91  100  100   80  89   86 ...    87   91   
4             111   91  110  103   93  116  110   99  78   77 ...    83  102   
5             117  107   99   88   97   90  100   91  97   88 ...   103  110   
6              92   95   90   86   91  103   98  100  89   96 ...    94  101   



weekofyear   45   46   47   48   49   50   51   52  
weekday                                             
0            99   92   99   83  107  106   93  107  
1           105   83  101   93  102   89  113   84  
2            96   84  110   83  104   84   84  116  
3            87   96   87   88   88   83  113   93  
4            93   81  104  108   72  101  109   97  
5            81  107   97   89   86  108  113  101  
6            93   92   93   91   89   96   93  226  

[7 rows x 52 columns]

However, instead of having the weekofyear values as my columns, I end up with a MultiIndex that I cannot get rid of, as shown here:

In [309]: p.columns
Out[309]: 
MultiIndex(levels=[['sum'], ['people'], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]],
           names=[None, None, 'weekofyear'])

The index, on the other hand, looks fine:

In [311]: p.index
Out[311]: Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64', name='weekday')

I tried the unstack() and reset_index() functions, but without success.

Am I missing something?

Best answer

Instead of passing lists for values and aggfunc, you should pass a single value for each. Example -

p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

Demo -

In [3]: table
Out[3]:
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

In [12]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

In [13]: p
Out[13]:
weekofyear   1    52
weekday
0            76  NaN
1            95  NaN
2           102  NaN
3            87  NaN
6           NaN  119

In [14]: p.columns
Out[14]: Int64Index([1, 52], dtype='int64', name='weekofyear')
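The difference between list and scalar arguments can be checked with a small self-contained sketch. The data here is synthetic (four made-up weeks, not the asker's table), and the string "sum" is used in place of np.sum on the assumption that it selects the same aggregation:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's table: 28 days starting on a Monday.
idx = pd.date_range("2012-01-02", periods=28, freq="D")
table = pd.DataFrame(
    {
        "people": np.arange(1, 29),                   # made-up counts
        "weekday": idx.weekday,                       # 0 = Monday ... 6 = Sunday
        "weekofyear": np.repeat(np.arange(1, 5), 7),  # weeks 1..4, seven days each
    },
    index=idx,
)

# Lists for values and aggfunc: each list adds a level to the columns.
nested = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"],
                        values=["people"], aggfunc=["sum"])

# Scalars for values and aggfunc: the columns are just the weekofyear values.
flat = pd.pivot_table(table, index="weekday", columns="weekofyear",
                      values="people", aggfunc="sum")

print(nested.columns.nlevels)  # 3
print(flat.columns.nlevels)    # 1
print(flat.columns.tolist())   # [1, 2, 3, 4]
```

Passing index and columns as one-element lists does not add levels; only the lists given to values and aggfunc do.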

From the documentation -

aggfunc : function, default numpy.mean, or list of functions
If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)

The same applies to values, although the documentation does not mention it specifically.
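As a rough check of the values behaviour, the sketch below rebuilds the five rows shown in the question and passes a list for values only. It also shows one possible way to flatten a pivot table that was already built with lists, using DataFrame.droplevel; this is an alternative to the approach above, not part of the original answer, and the variable names are illustrative:

```python
import pandas as pd

# The five rows shown in the question.
table = pd.DataFrame(
    {
        "people": [119, 76, 95, 102, 87],
        "weekday": [6, 0, 1, 2, 3],
        "weekofyear": [52, 1, 1, 1, 1],
    },
    index=pd.to_datetime(
        ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04", "2012-01-05"]
    ),
)

# A list for values alone is enough to add a column level ('people').
only_values_list = pd.pivot_table(table, index="weekday", columns="weekofyear",
                                  values=["people"], aggfunc="sum")
print(only_values_list.columns.nlevels)  # 2

# If the nested result already exists, drop the two outer levels
# ('sum' and 'people') and keep only the weekofyear level.
nested = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"],
                        values=["people"], aggfunc=["sum"])
flattened = nested.droplevel([0, 1], axis=1)
print(flattened.columns.tolist())  # [1, 52]
```

Dropping levels after the fact is mainly useful when the lists are needed elsewhere; otherwise passing scalars avoids creating the extra levels in the first place.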

We found a similar question on Stack Overflow: https://stackoverflow.com/questions/32654495/
