python - DatetimeIndex.to_period raises ValueError for many offset aliases

Tags: python python-3.x pandas dataframe

I am trying to solve what should be a very simple problem, but I am stuck. I have a simple DataFrame with a DatetimeIndex, built like this:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    index=pd.date_range(start='2017-01-01', end='2017-03-04'),
    data=np.arange(63), columns=['val']).rename_axis(index='date')

In [179]: df                                                                                                                                                                                                         
Out[179]: 
            val
date           
2017-01-01    0
2017-01-02    1
2017-01-03    2
2017-01-04    3
2017-01-05    4
...         ...
2017-02-28   58
2017-03-01   59
2017-03-02   60
2017-03-03   61
2017-03-04   62

[63 rows x 1 columns] 

I want to aggregate the values by period: weekly, semi-monthly, monthly, and so on. So I tried:

In [180]: df.to_period('W').groupby('date').sum()                                                                                                                                                                    
Out[180]: 
                       val
date                      
2016-12-26/2017-01-01    0
2017-01-02/2017-01-08   28
2017-01-09/2017-01-15   77
2017-01-16/2017-01-22  126
2017-01-23/2017-01-29  175
2017-01-30/2017-02-05  224
2017-02-06/2017-02-12  273
2017-02-13/2017-02-19  322
2017-02-20/2017-02-26  371
2017-02-27/2017-03-05  357

This works for offset aliases such as Y, M, D, W, T, S, L, U and N, but it fails for SM, SMS and several of the other aliases listed here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases

It raises a ValueError:

In [181]: df.to_period('SMS').groupby('date').sum()
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
pandas/_libs/tslibs/frequencies.pyx in pandas._libs.tslibs.frequencies._period_str_to_code()

KeyError: 'SMS-15'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-181-6779559a0596> in <module>
----> 1 df.to_period('SMS').groupby('date').sum()

~/.virtualenvs/py36/lib/python3.6/site-packages/pandas/core/frame.py in to_period(self, freq, axis, copy)
   8350         axis = self._get_axis_number(axis)
   8351         if axis == 0:
-> 8352             new_data.set_axis(1, self.index.to_period(freq=freq))
   8353         elif axis == 1:
   8354             new_data.set_axis(0, self.columns.to_period(freq=freq))

~/.virtualenvs/py36/lib/python3.6/site-packages/pandas/core/accessor.py in f(self, *args, **kwargs)
     91         def _create_delegator_method(name):
     92             def f(self, *args, **kwargs):
---> 93                 return self._delegate_method(name, *args, **kwargs)
     94
     95             f.__name__ = name

~/.virtualenvs/py36/lib/python3.6/site-packages/pandas/core/indexes/datetimelike.py in _delegate_method(self, name, *args, **kwargs)
    811
    812     def _delegate_method(self, name, *args, **kwargs):
--> 813         result = operator.methodcaller(name, *args, **kwargs)(self._data)
    814         if name not in self._raw_methods:
    815             result = Index(result, name=self.name)

~/.virtualenvs/py36/lib/python3.6/site-packages/pandas/core/arrays/datetimes.py in to_period(self, freq)
   1280             freq = get_period_alias(freq)
   1281
-> 1282         return PeriodArray._from_datetime64(self._data, freq, tz=self.tz)
   1283
   1284     def to_perioddelta(self, freq):

~/.virtualenvs/py36/lib/python3.6/site-packages/pandas/core/arrays/period.py in _from_datetime64(cls, data, freq, tz)
    273         PeriodArray[freq]
    274         """
--> 275         data, freq = dt64arr_to_periodarr(data, freq, tz)
    276         return cls(data, freq=freq)
    277

~/.virtualenvs/py36/lib/python3.6/site-packages/pandas/core/arrays/period.py in dt64arr_to_periodarr(data, freq, tz)
    914         data = data._values
    915
--> 916     base, mult = libfrequencies.get_freq_code(freq)
    917     return libperiod.dt64arr_to_periodarr(data.view("i8"), base, tz), freq
    918

pandas/_libs/tslibs/frequencies.pyx in pandas._libs.tslibs.frequencies.get_freq_code()

pandas/_libs/tslibs/frequencies.pyx in pandas._libs.tslibs.frequencies.get_freq_code()

pandas/_libs/tslibs/frequencies.pyx in pandas._libs.tslibs.frequencies.get_freq_code()

pandas/_libs/tslibs/frequencies.pyx in pandas._libs.tslibs.frequencies._period_str_to_code()

ValueError: Invalid frequency: SMS-15

I am using Python 3.6.5 and pandas 0.25.1.
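The split between aliases that work and aliases that fail can be checked with a small probe. This is only a sketch: the helper name `probe_period_aliases` and the short alias list are made up for illustration, and the set of caught exception types is an assumption, since the exact exception may differ between pandas releases.

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the question: 63 daily rows from 2017-01-01.
df = pd.DataFrame(
    {"val": np.arange(63)},
    index=pd.date_range("2017-01-01", "2017-03-04", name="date"),
)


def probe_period_aliases(frame, aliases):
    """Hypothetical helper: split aliases by whether to_period accepts them."""
    accepted, rejected = [], []
    for alias in aliases:
        try:
            frame.to_period(alias)
        except (ValueError, TypeError, AttributeError, KeyError) as exc:
            # Assumption: the exception type may vary by pandas version,
            # so several plausible types are caught and recorded by name.
            rejected.append((alias, type(exc).__name__))
        else:
            accepted.append(alias)
    return accepted, rejected


accepted, rejected = probe_period_aliases(df, ["D", "W", "M", "SMS"])
print("accepted:", accepted)
print("rejected:", rejected)
```

Only aliases that map to a fixed period type convert cleanly. Semi-month offsets such as SMS have no corresponding period frequency, which is consistent with the "Invalid frequency: SMS-15" message in the traceback.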

Best answer

Use DataFrame.resample here:

print(df.resample('W').sum())
            val
date           
2017-01-01    0
2017-01-08   28
2017-01-15   77
2017-01-22  126
2017-01-29  175
2017-02-05  224
2017-02-12  273
2017-02-19  322
2017-02-26  371
2017-03-05  357

print(df.resample('SM').sum())
            val
date           
2016-12-31   91
2017-01-15  344
2017-01-31  555
2017-02-15  663
2017-02-28  300

print(df.resample('SMS').sum())
            val
date           
2017-01-01   91
2017-01-15  374
2017-02-01  525
2017-02-15  721
2017-03-01  242
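The SMS totals above can be cross-checked by summing date slices by hand. The sketch below assumes the same 63-row frame as in the question; the variable names are illustrative only. It shows that each label marks the start of its bin, so the 2017-01-01 bin covers 1 to 14 January and the 2017-01-15 bin covers 15 to 31 January.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question so the block runs on its own.
df = pd.DataFrame(
    {"val": np.arange(63)},
    index=pd.date_range("2017-01-01", "2017-03-04", name="date"),
)

semi_month = df.resample("SMS").sum()

# Label-based slicing on a DatetimeIndex includes both endpoints.
first_half_jan = df.loc["2017-01-01":"2017-01-14", "val"].sum()
second_half_jan = df.loc["2017-01-15":"2017-01-31", "val"].sum()

print(first_half_jan, semi_month["val"].iloc[0])
print(second_half_jan, semi_month["val"].iloc[1])
```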

An alternative using groupby with Grouper:

print(df.groupby(pd.Grouper(freq='W')).sum())
print(df.groupby(pd.Grouper(freq='SM')).sum())
print(df.groupby(pd.Grouper(freq='SMS')).sum())
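As a quick sanity check, the two approaches can be compared side by side. This sketch assumes the same frame as in the question, and the `matches` dictionary is just an illustrative name. It compares the aggregated values and bin labels from `resample` and from `groupby(pd.Grouper(...))` for two of the frequencies used above.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question so the block runs on its own.
df = pd.DataFrame(
    {"val": np.arange(63)},
    index=pd.date_range("2017-01-01", "2017-03-04", name="date"),
)

matches = {}
for freq in ["W", "SMS"]:
    resampled = df.resample(freq).sum()
    grouped = df.groupby(pd.Grouper(freq=freq)).sum()
    # Compare both the sums and the bin labels.
    matches[freq] = (
        resampled["val"].tolist() == grouped["val"].tolist()
        and list(resampled.index) == list(grouped.index)
    )

print(matches)
```

If both entries are True, either spelling can be used. `pd.Grouper` is mainly convenient when the time bins need to be combined with other grouping keys.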

Regarding "python - DatetimeIndex.to_period raises ValueError for many offset aliases", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58064824/
