I'm trying to remove all "old" values from a pandas TimeSeries, e.g. all values more than 1 day old (relative to the latest value).
Naively, I tried something like this:
from datetime import timedelta

def trim(series):
    return series[series.index.max() - series.index < timedelta(days=1)]
This raises an error:
TypeError: ufunc 'subtract' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule 'safe'
Evidently, the problem lies in this expression: series.index.max() - series.index
Then I found that this works:
def trim(series):
    return series[series.index > series.index.max() - timedelta(days=1)]
Can someone explain why the latter works while the former raises an error?
Edit: I'm using pandas version 0.12.0
Best answer
Here is an example in 0.13 (to_timedelta is not available in 0.12, so there you would have to use np.timedelta64(4,'D') instead):
In [12]: rng = pd.date_range('1/1/2011', periods=10, freq='D')
In [13]: ts = pd.Series(randn(len(rng)), index=rng)
In [14]: ts
Out[14]:
2011-01-01 -0.348362
2011-01-02 1.782487
2011-01-03 1.146537
2011-01-04 -0.176308
2011-01-05 -0.185240
2011-01-06 1.767135
2011-01-07 0.615911
2011-01-08 2.459799
2011-01-09 0.718081
2011-01-10 -0.520741
Freq: D, dtype: float64
In [15]: x = ts.index.to_series().max()-ts.index.to_series()
In [16]: x
Out[16]:
2011-01-01 9 days
2011-01-02 8 days
2011-01-03 7 days
2011-01-04 6 days
2011-01-05 5 days
2011-01-06 4 days
2011-01-07 3 days
2011-01-08 2 days
2011-01-09 1 days
2011-01-10 0 days
Freq: D, dtype: timedelta64[ns]
In [17]: x[x>pd.to_timedelta('4 days')]
Out[17]:
2011-01-01 9 days
2011-01-02 8 days
2011-01-03 7 days
2011-01-04 6 days
2011-01-05 5 days
Freq: D, dtype: timedelta64[ns]
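For reference, the two approaches from the question and the age computation from the answer can be put side by side. This is a minimal sketch that assumes a recent pandas release, where subtracting a DatetimeIndex from a Timestamp yields a TimedeltaIndex. The `window` parameter and the deterministic sample values are illustrative choices, not part of the original post.

```python
import numpy as np
import pandas as pd


def trim(series, window=pd.Timedelta(days=1)):
    """Keep entries strictly newer than (latest timestamp - window)."""
    # Timestamp - Timedelta gives a single Timestamp cutoff, so the
    # comparison below is an ordinary DatetimeIndex vs. scalar comparison.
    cutoff = series.index.max() - window
    return series[series.index > cutoff]


rng = pd.date_range("2011-01-01", periods=10, freq="D")
# Deterministic values instead of randn, so the result is reproducible.
ts = pd.Series(np.arange(len(rng), dtype=float), index=rng)

# Approach that worked in the question: compare the index against a cutoff.
recent = trim(ts, pd.Timedelta(days=4))

# Approach that failed on 0.12: compute each entry's age, then filter.
# On a recent pandas this subtraction is assumed to return a TimedeltaIndex.
age = ts.index.max() - ts.index
recent_by_age = ts[age < pd.Timedelta(days=4)]

# The answer's route via Series, using np.timedelta64 as the threshold
# (the substitute the answer suggests when to_timedelta is unavailable).
ages = ts.index.to_series().max() - ts.index.to_series()
old = ages[ages > np.timedelta64(4, "D")]
```

With the 10-day sample index, both `recent` and `recent_by_age` keep the last four days, and `old` holds the five entries whose age exceeds four days, matching Out[17] above.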
Regarding python - trimming a TimeSeries by timedelta, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21252904/