I have a data frame containing stacked monthly values, like this:
Value Month
0 0.09187 Jan
1 0.72878 Feb
2 0.92052 Mar
3 -1.86845 Apr
4 -1.16489 May
5 -0.61433 Jun
6 0.68008 Jul
7 -1.50555 Aug
8 -0.18985 Sep
9 -1.11380 Oct
10 -0.63838 Nov
11 0.37527 Dec
12 0.234216 Jan
I want to add a column of years from a known range, so that df looks like:
Value Month Year
0 0.09187 Jan 1950
1 0.72878 Feb 1950
2 0.92052 Mar 1950
3 -1.86845 Apr 1950
4 -1.16489 May 1950
5 -0.61433 Jun 1950
6 0.68008 Jul 1950
7 -1.50555 Aug 1950
8 -0.18985 Sep 1950
9 -1.11380 Oct 1950
10 -0.63838 Nov 1950
11 0.37527 Dec 1950
12 0.234216 Jan 1951
I tried initializing a list of years to apply to the column:
years = list(range(1950, 2000))
df['Year'] = years * 12
But this produced
Value Month Year
0 0.09187 Jan 1950
1 0.72878 Feb 1951
2 0.92052 Mar 1952
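and so on. I haven't been able to come up with any other approach.
Best answer
As long as you know that you have Jan data for every year, you can do the following: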
df['Year'] = df['Month'].eq('Jan').cumsum()+1949
>>> df
Value Month Year
0 0.091870 Jan 1950
1 0.728780 Feb 1950
2 0.920520 Mar 1950
3 -1.868450 Apr 1950
4 -1.164890 May 1950
5 -0.614330 Jun 1950
6 0.680080 Jul 1950
7 -1.505550 Aug 1950
8 -0.189850 Sep 1950
9 -1.113800 Oct 1950
10 -0.638380 Nov 1950
11 0.375270 Dec 1950
12 0.234216 Jan 1951
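To see why the cumulative sum works, the one-liner can be broken into steps. This is only a sketch: the names `is_jan` and `year_offset` are illustrative and not part of the original answer, and the data frame is rebuilt by hand from the sample rows in the question.

```python
import pandas as pd

# Sample data copied from the question (13 rows: one full year plus one January).
df = pd.DataFrame({
    "Value": [0.09187, 0.72878, 0.92052, -1.86845, -1.16489, -0.61433,
              0.68008, -1.50555, -0.18985, -1.11380, -0.63838, 0.37527,
              0.234216],
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "Jan"],
})

# Boolean Series: True on every January row, False elsewhere.
is_jan = df["Month"].eq("Jan")

# Running count of Januaries seen so far: 1 for the first year, 2 for the second, ...
year_offset = is_jan.cumsum()

# Shift the counter so the first January maps to 1950.
df["Year"] = year_offset + 1949
```

Because the counter only advances on a "Jan" row, this relies on every year having a January entry and on the rows being in chronological order.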
Alternatively, you can follow your original logic but use np.repeat:
import numpy as np
years = list(range(1950, 2000))
# Repeat each year 12 times, then trim to the length of df (assumes df starts in Jan)
df['Year'] = np.repeat(years, 12)[:len(df)]
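The difference between the attempt in the question and np.repeat comes down to tiling versus repeating. A small sketch with a shortened, hypothetical range of three years and a repeat count of 2 makes it visible:

```python
import numpy as np

years = list(range(1950, 1953))  # shortened range, for illustration only

# Multiplying a list tiles the whole sequence end to end.
tiled = years * 2

# np.repeat repeats each element in place before moving to the next one.
repeated = np.repeat(years, 2)

print(tiled)              # [1950, 1951, 1952, 1950, 1951, 1952]
print(repeated.tolist())  # [1950, 1950, 1951, 1951, 1952, 1952]
```

Note that the repeated array has exactly `len(years) * 12` entries in the real case, so it must match the number of rows in df for the assignment to succeed.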
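Or another option:
# 'MS' (month start) is used here because the 'm'/'M' alias is deprecated in newer pandas
df['Year'] = pd.date_range('1950-01-01', periods=len(df), freq='MS').year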
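The date-range approach assumes there are no missing months, since it simply counts rows from January 1950. A possible sanity check, sketched below, compares the generated months against the Month column. The names `month_abbr`, `dates`, `expected` and `aligned` are illustrative assumptions, and the month-start alias 'MS' is used as one choice that is valid in current pandas.

```python
import pandas as pd

month_abbr = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Stand-in for the question's data: one full year plus one extra January.
df = pd.DataFrame({"Month": month_abbr + ["Jan"]})

# One timestamp per row, spaced one month apart, starting in January 1950.
dates = pd.date_range("1950-01-01", periods=len(df), freq="MS")
df["Year"] = dates.year

# Check that each generated date falls in the month named in the Month column.
expected = [month_abbr[m - 1] for m in dates.month]
aligned = df["Month"].tolist() == expected
```

If `aligned` turns out to be False, some months are missing or out of order, and the years assigned this way would drift.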
Regarding "python - adding a column of repeating sequence values", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52434128/