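I'm fairly new to Python (pandas); please help me solve this problem.
This is what my DataFrame looks like. Device_Id is the ID of the device that reported a message (Msg) at a given time (for example 1524724677); Time is in epoch seconds.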
Device_Id Msg Time
0 ABC123 connected 1524724677
1 ABC123 connected 1524724679
2 XYZ123 device failed 1524724814
3 ABC123 connected 1524725279
4 XVZ123 device failed 1524725300
5 PQR123 error 1524725325
6 ABC123 connected 1524725345
I need to process every row of the DataFrame so that I can add some new columns.
The DataFrame I want looks like this:
Device_Id Msg Time count
0 ABC123 connected 1524724677 1
1 ABC123 connected 1524724679 2
2 XYZ123 device failed 1524724814 1
3 ABC123 connected 1524725279 1
4 XVZ123 device failed 1524725300 1
5 PQR123 error 1524725325 1
6 ABC123 connected 1524725345 2
The count column works as follows.
Please read all of the points to see exactly how it is computed:
--for row(0), count is (1), because this device has not been seen before
--the counter increases with respect to (Time)
--the counter values are reset after every 10 minutes
--for row(1), count is (2), because time (1524724679) is between
1524724677 and 1524724677 + 10 minutes.
--for row(2), it is a new device and time (1524724814) is
between 1524724677 and 1524724677 + 10 minutes, so count is (1).
--for row(3), notice the device is not new, yet count is still (1),
because time (1524725279) is not between 1524724677 and 1524724677 + 10
minutes. (Count reset)
--for row(4), count is (1): the device is new in this window and time (1524725300) is between
1524725279 and 1524725279 + 10 minutes.
--for row(5), count is (1): new device, and time (1524725325) is between 1524725279
and 1524725279 + 10 minutes.
--for row(6), count is (2), because time (1524725345) is between 1524725279
and 1524725279 + 10 minutes.
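The count values reset every 10 minutes, which means each Device_Id starts again from (1).
After each 10-minute period, every Device_Id is treated as new, which is why the count restarts at 1 and then keeps incrementing for the next 10 minutes.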
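The rules above can be read literally as a single pass over the rows. The following is a minimal sketch of that reading, not code from the original post: the helper name count_in_anchored_windows is hypothetical, rows are assumed to be sorted by Time, and the window is assumed to be half-open (a row exactly 600 seconds after the window start opens a new window), since the question does not say how that boundary should be treated.

```python
import pandas as pd

WINDOW_SECONDS = 600  # 10 minutes, as described in the question


def count_in_anchored_windows(df, window=WINDOW_SECONDS):
    """Hypothetical helper: per-device counter that resets whenever a row
    falls outside the current 10-minute window. Each window starts at the
    first row that did not fit into the previous one.
    Assumes df is already sorted by the epoch-seconds column 'Time'."""
    counts = []
    seen = {}            # device id -> occurrences inside the current window
    window_start = None  # epoch seconds of the row that opened the window
    for device, t in zip(df["Device_Id"], df["Time"]):
        if window_start is None or t - window_start >= window:
            window_start = t  # this row opens a new window
            seen = {}         # every device counts as new again
        seen[device] = seen.get(device, 0) + 1
        counts.append(seen[device])
    return counts


df = pd.DataFrame({
    "Device_Id": ["ABC123", "ABC123", "XYZ123", "ABC123", "XVZ123", "PQR123", "ABC123"],
    "Msg": ["connected", "connected", "device failed", "connected",
            "device failed", "error", "connected"],
    "Time": [1524724677, 1524724679, 1524724814, 1524725279,
             1524725300, 1524725325, 1524725345],
})
df["count"] = count_in_anchored_windows(df)
print(df)
```

On the sample data this reproduces the desired count column. A plain Python loop is slow on large frames, so it is best treated as a reference for the intended behaviour rather than a production approach.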
Best answer
You can solve this easily with groupby and pd.Grouper:
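import pandas as pd

# convert epoch seconds to datetime
df['Time'] = pd.to_datetime(df['Time'], unit='s')
# number each row within its (Device_Id, 10-minute bin) group, starting at 1
df['count'] = df.groupby(['Device_Id', pd.Grouper(key='Time', freq='10min')]).cumcount() + 1
print(df)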
Device_Id Msg Time count
0 ABC123 connected 2018-04-26 06:37:57 1
1 ABC123 connected 2018-04-26 06:37:59 2
2 XYZ123 device failed 2018-04-26 06:40:14 1
3 ABC123 connected 2018-04-26 06:47:59 1
4 XVZ123 device failed 2018-04-26 06:48:20 1
5 PQR123 error 2018-04-26 06:48:45 1
6 ABC123 connected 2018-04-26 06:49:05 2
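One detail worth checking before relying on this: fixed-frequency 10-minute bins are aligned to the clock (06:30, 06:40, ...), not to the first event as the question describes. The sketch below makes the bins visible by flooring each timestamp with dt.floor('10min'), which on this data should give the same grouping as the Grouper call. The second frame, edge, is made-up data (an assumption for illustration) showing two events 20 seconds apart that land in different bins.

```python
import pandas as pd

df = pd.DataFrame({
    "Device_Id": ["ABC123", "ABC123", "XYZ123", "ABC123", "XVZ123", "PQR123", "ABC123"],
    "Msg": ["connected", "connected", "device failed", "connected",
            "device failed", "error", "connected"],
    "Time": [1524724677, 1524724679, 1524724814, 1524725279,
             1524725300, 1524725325, 1524725345],
})
df["Time"] = pd.to_datetime(df["Time"], unit="s")

# Start of the clock-aligned 10-minute bin that each row falls into
bins = df["Time"].dt.floor("10min")
df["count"] = df.groupby(["Device_Id", bins]).cumcount() + 1
print(df.assign(bin=bins))

# Made-up edge case: 06:39:50 and 06:40:10 are only 20 seconds apart,
# but they sit on opposite sides of the 06:40 bin boundary.
edge = pd.DataFrame({
    "Device_Id": ["ABC123", "ABC123"],
    "Time": pd.to_datetime([1524724790, 1524724810], unit="s"),
})
edge_bins = edge["Time"].dt.floor("10min")
edge["count"] = edge.groupby(["Device_Id", edge_bins]).cumcount() + 1
print(edge)
```

For the sample data the clock-aligned bins happen to match the expected output. In the edge case both rows get count 1, whereas a window anchored at the first event would give 1 and 2, so which behaviour is wanted is worth deciding explicitly.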
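Regarding "python - How can I check similarity between rows in a DataFrame and add a column as a counter that increments when rows match?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50212175/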