python - Pandas cannot create a DataFrame from a NumPy array of timestamps

Tags: python arrays numpy pandas dataframe

I have a NumPy array of pandas Timestamps:

array([[Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T'),
        Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T'),
        Timestamp('2016-05-02 15:50:00+0000', tz='UTC', offset='5T')],
       [Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T'),
        Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T'),
        Timestamp('2016-05-02 17:10:00+0000', tz='UTC', offset='5T')],
       [Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T'),
        Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T'),
        Timestamp('2016-05-02 20:25:00+0000', tz='UTC', offset='5T')]], dtype=object)

I cannot create a DataFrame from this array; attempting to do so raises the following error:

AssertionError: Number of Block dimensions (1) must equal number of axes (2)

The array is clearly two-dimensional, and I verified this with ndim.

Why can't I create a DataFrame?

Best answer

I think you can use a list comprehension:

import pandas as pd
import numpy as np

# The offset='5T' shown in the question's repr is omitted here:
# newer pandas releases no longer accept that keyword in pd.Timestamp.
a = np.array([[pd.Timestamp('2016-05-02 15:50:00+0000', tz='UTC'),
               pd.Timestamp('2016-05-02 15:50:00+0000', tz='UTC'),
               pd.Timestamp('2016-05-02 15:50:00+0000', tz='UTC')],
              [pd.Timestamp('2016-05-02 17:10:00+0000', tz='UTC'),
               pd.Timestamp('2016-05-02 17:10:00+0000', tz='UTC'),
               pd.Timestamp('2016-05-02 17:10:00+0000', tz='UTC')],
              [pd.Timestamp('2016-05-02 20:25:00+0000', tz='UTC'),
               pd.Timestamp('2016-05-02 20:25:00+0000', tz='UTC'),
               pd.Timestamp('2016-05-02 20:25:00+0000', tz='UTC')]], dtype=object)

# Pass the rows as a list of 1-D arrays instead of one 2-D object array
df = pd.DataFrame([x for x in a], columns=['a','b','c'])
print(df)
                          a                         b  \
0 2016-05-02 15:50:00+00:00 2016-05-02 15:50:00+00:00   
1 2016-05-02 17:10:00+00:00 2016-05-02 17:10:00+00:00   
2 2016-05-02 20:25:00+00:00 2016-05-02 20:25:00+00:00   

                          c  
0 2016-05-02 15:50:00+00:00  
1 2016-05-02 17:10:00+00:00  
2 2016-05-02 20:25:00+00:00  
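
The list-comprehension workaround above can be sketched in a compact, self-contained form. This is only an illustration under a few assumptions: the variable names `times` and `rows` are made up here, the `offset` keyword from the question is left out, and the behaviour of the original failing call on the asker's pandas version is not reproduced.

```python
import numpy as np
import pandas as pd

# Hypothetical helper data: the three distinct times from the question
times = ["2016-05-02 15:50", "2016-05-02 17:10", "2016-05-02 20:25"]

# Build a 3x3 object array of tz-aware Timestamps (same layout as in the question)
a = np.array([[pd.Timestamp(t, tz="UTC")] * 3 for t in times], dtype=object)

# Split the 2-D array into a list of 1-D rows before handing it to DataFrame
rows = [row for row in a]
df = pd.DataFrame(rows, columns=["a", "b", "c"])

print(a.ndim)     # dimensionality of the source array
print(df.shape)   # shape of the resulting frame
print(df.dtypes)  # inspect what pandas inferred for each column
```

Iterating over a 2-D array yields its rows one by one, so the constructor receives a plain list of 1-D arrays rather than a single 2-D object block.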

Another solution is DataFrame.from_records:

print(pd.DataFrame.from_records(a, columns=['a','b','c']))
                          a                         b  \
0 2016-05-02 15:50:00+00:00 2016-05-02 15:50:00+00:00   
1 2016-05-02 17:10:00+00:00 2016-05-02 17:10:00+00:00   
2 2016-05-02 20:25:00+00:00 2016-05-02 20:25:00+00:00   

                          c  
0 2016-05-02 15:50:00+00:00  
1 2016-05-02 17:10:00+00:00  
2 2016-05-02 20:25:00+00:00  
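
As a rough check that the two approaches agree, the sketch below builds the frame both ways and compares the results cell by cell. The names `times`, `df_rows` and `df_records` are illustrative assumptions, and the `offset` keyword from the question is again omitted.

```python
import numpy as np
import pandas as pd

# Hypothetical helper data: the three distinct times from the question
times = ["2016-05-02 15:50", "2016-05-02 17:10", "2016-05-02 20:25"]
a = np.array([[pd.Timestamp(t, tz="UTC")] * 3 for t in times], dtype=object)

# Approach 1: list of rows passed to the regular constructor
df_rows = pd.DataFrame([row for row in a], columns=["a", "b", "c"])

# Approach 2: the from_records alternate constructor
df_records = pd.DataFrame.from_records(a, columns=["a", "b", "c"])

# Element-wise comparison of the two frames
same = bool((df_rows == df_records).all().all())
print(same)
```

Either form should work; from_records is slightly shorter because it accepts the array directly.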

See the alternate constructors of DataFrame in the pandas documentation.

Source: this question, "python - Pandas cannot create a DataFrame from a NumPy array of timestamps", comes from Stack Overflow: https://stackoverflow.com/questions/37445334/
