Suppose I have the following data frame:
ID StationID Date ParamName ParamValue
0 A 1990-01-08 metal 0.5
1 A 1990-01-08 wood 1.4
2 A 1990-01-08 glass 9.7
3 B 1990-01-08 metal 0.8
4 B 1990-01-08 wood 4.8
5 C 1990-01-08 metal 0.6
6 A 1990-02-03 metal 0.5
7 A 1990-03-01 metal 1.2
8 B 1990-03-01 metal 0.9
9 C 1990-03-01 metal 1.1
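How can I renumber the ID column so that it increments only when Date or StationID changes? In other words, how do I turn the data frame above into the one below (assuming Date holds datetime objects)?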
ID StationID Date ParamName ParamValue
0 A 1990-01-08 metal 0.5
0 A 1990-01-08 wood 1.4
0 A 1990-01-08 glass 9.7
1 B 1990-01-08 metal 0.8
1 B 1990-01-08 wood 4.8
2 C 1990-01-08 metal 0.6
3 A 1990-02-03 metal 0.5
4 A 1990-03-01 metal 1.2
5 B 1990-03-01 metal 0.9
6 C 1990-03-01 metal 1.1
Best answer
Is this what you need?
df.assign(ID=(df.StationID!=df.StationID.shift()).cumsum()-1)
Out[151]:
ID StationID Date ParamName ParamValue
0 0 A 1990-01-08 metal 0.5
1 0 A 1990-01-08 wood 1.4
2 0 A 1990-01-08 glass 9.7
3 1 B 1990-01-08 metal 0.8
4 1 B 1990-01-08 wood 4.8
5 2 C 1990-01-08 metal 0.6
6 3 A 1990-02-03 metal 0.5
7 3 A 1990-03-01 metal 1.2
8 4 B 1990-03-01 metal 0.9
9 5 C 1990-03-01 metal 1.1
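Update :-) The version above only compares StationID; this one also takes Date into account:
# Temporary key combining station and date, e.g. 'A1990-01-08'
df['ID'] = df.StationID + df.Date.astype(str)
# Start a new ID whenever the key differs from the previous row
df.assign(ID=(df.ID != df.ID.shift()).cumsum() - 1)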
Out[163]:
ID StationID Date ParamName ParamValue
0 0 A 1990-01-08 metal 0.5
1 0 A 1990-01-08 wood 1.4
2 0 A 1990-01-08 glass 9.7
3 1 B 1990-01-08 metal 0.8
4 1 B 1990-01-08 wood 4.8
5 2 C 1990-01-08 metal 0.6
6 3 A 1990-02-03 metal 0.5
7 4 A 1990-03-01 metal 1.2
8 5 B 1990-03-01 metal 0.9
9 6 C 1990-03-01 metal 1.1
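For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It is a variant of the update above, not the answerer's exact code: instead of overwriting ID with a temporary string key, it compares the StationID and Date columns with their shifted copies directly. The names `keys`, `changed` and `result` are only illustrative, and the frame is rebuilt by hand from the question's table.

```python
import pandas as pd

# Rebuild the example frame from the question (Date parsed to datetime).
df = pd.DataFrame({
    "ID": range(10),
    "StationID": ["A", "A", "A", "B", "B", "C", "A", "A", "B", "C"],
    "Date": pd.to_datetime(
        ["1990-01-08"] * 6 + ["1990-02-03"] + ["1990-03-01"] * 3
    ),
    "ParamName": ["metal", "wood", "glass", "metal", "wood",
                  "metal", "metal", "metal", "metal", "metal"],
    "ParamValue": [0.5, 1.4, 9.7, 0.8, 4.8, 0.6, 0.5, 1.2, 0.9, 1.1],
})

keys = df[["StationID", "Date"]]
# True wherever either key column differs from the previous row.
# The first row is compared against missing values, so it is always True.
changed = (keys != keys.shift()).any(axis=1)
# Running count of changes, offset so the numbering starts at 0.
result = df.assign(ID=changed.cumsum() - 1)

print(result["ID"].tolist())
# [0, 0, 0, 1, 1, 2, 3, 4, 5, 6]
```

Like the shift-based answer, this only looks at consecutive rows, so a (StationID, Date) pair that reappears later, after a different pair, would get a fresh ID. That matches the question as asked, but it is worth keeping in mind if the data is not sorted.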
For more on "python - Reindexing a pandas data frame based on date and column value", see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/48859640/