I have a txt file in the following format:
Event A 15MAR18 103000 15MAR18 103758
Event A 16MAR18 120518 16MAR18 121308
Event B 16MAR18 121203 16MAR18 124543
Event B 16MAR18 134443 16MAR18 141823
Event B 16MAR18 151733 16MAR18 155103
Event B 17MAR18 165013 17MAR18 172343
Event B 17MAR18 182253 17MAR18 185623
Event B 17MAR18 195533 17MAR18 202903
Event A 17MAR18 203738 17MAR18 204028
Event B 18MAR18 212813 18MAR18 220143
Event A 18MAR18 221058 18MAR18 222338
Event B 18MAR18 230103 18MAR18 233423
Event A 19MAR18 234728 19MAR18 000048
Event B 20MAR18 003343 20MAR18 010703
Event A 20MAR18 012508 20MAR18 013418
Event B 21MAR18 020623 21MAR18 023943
Event B 21MAR18 033903 21MAR18 041223
Event B 21MAR18 051143 21MAR18 054503
Event B 21MAR18 064433 21MAR18 071743
Event A 22MAR18 074058 22MAR18 075008
Event B 22MAR18 081713 22MAR18 085023
Event A 23MAR18 091438 23MAR18 092738
Event B 23MAR18 094953 23MAR18 102303
Event A 23MAR18 105148 23MAR18 110418
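I am trying to split the file into blocks based on a 24-hour time difference in the middle column.
For example, the first line, 15MAR18 103000, would be a separate list of its own.
The second line would then start a different list, because the timedelta is > 24 hours. 16MAR18 120518 through 16MAR18 151733 would be grouped together, and so on.
My attempt is below: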
List_Segment_1 = []
with open('file.txt', 'r') as input_file:
    input_file = input_file.readlines()
    startTime = datetime.strptime(input_file[0][15:29], '%d%b%y %H%M%S')
    endTime = startTime + timedelta(hours=24)
    for line in input_file:
        dates = datetime.strptime(line[15:29], '%d%b%y %H%M%S')
        if startTime < dates < endTime:
            List_Segment_1.append(line)
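I don't know how to handle the remaining lines; this only produces the first "segment", and the real txt file has hundreds of lines. Maybe there is a better way to segment the data, with a dictionary of some kind?
Any help is appreciated. Ideally without pandas or any other third-party libraries.
The output should look like this: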
Event A 15MAR18 103000 15MAR18 103758 Segment1
Event A 16MAR18 120518 16MAR18 121308 Segment2
Event B 16MAR18 121203 16MAR18 124543 Segment2
Event B 16MAR18 134443 16MAR18 141823 Segment2
Event B 16MAR18 151733 16MAR18 155103 Segment2
Event B 17MAR18 165013 17MAR18 172343 Segment3
Event B 17MAR18 182253 17MAR18 185623 Segment3
Event B 17MAR18 195533 17MAR18 202903 Segment3
Event A 17MAR18 203738 17MAR18 204028 Segment3
Event B 18MAR18 212813 18MAR18 220143 Segment4
Event A 18MAR18 221058 18MAR18 222338 Segment4
Event B 18MAR18 230103 18MAR18 233423 Segment4
Event A 19MAR18 234728 19MAR18 000048 Segment5
Event B 20MAR18 003343 20MAR18 010703 Segment5
Event A 20MAR18 012508 20MAR18 013418 Segment5
Event B 21MAR18 020623 21MAR18 023943 Segment6
Event B 21MAR18 033903 21MAR18 041223 Segment6
Event B 21MAR18 051143 21MAR18 054503 Segment6
Event B 21MAR18 064433 21MAR18 071743 Segment6
Event A 22MAR18 074058 22MAR18 075008 Segment6
Event B 22MAR18 081713 22MAR18 085023 Segment7
Event A 23MAR18 091438 23MAR18 092738 Segment8
Event B 23MAR18 094953 23MAR18 102303 Segment8
Event A 23MAR18 105148 23MAR18 110418 Segment8
Best answer
Here is a simple implementation for your problem; you should modify it as needed:
from datetime import datetime, timedelta

with open('file.txt', 'r') as input_file:
    lines = input_file.readlines()

# The slice [14:28] is expected to cover the start timestamp;
# adjust it if the column widths in your file differ.
base_time = datetime.strptime(lines[0][14:28], '%d%b%y %H%M%S')
end_time = base_time + timedelta(hours=24)
segment = 1

for line in lines:
    date = datetime.strptime(line[14:28], '%d%b%y %H%M%S')
    # A line outside the current 24-hour window starts a new segment,
    # and the window restarts at that line's timestamp.
    if not (base_time <= date < end_time):
        segment += 1
        base_time = date
        end_time = date + timedelta(hours=24)
    print(line.strip() + '\tSegment {}'.format(segment))
This code outputs:
Event A 15MAR18 103000 15MAR18 103758 Segment 1
Event A 16MAR18 120518 16MAR18 121308 Segment 2
Event B 16MAR18 121203 16MAR18 124543 Segment 2
Event B 16MAR18 134443 16MAR18 141823 Segment 2
Event B 16MAR18 151733 16MAR18 155103 Segment 2
Event B 17MAR18 165013 17MAR18 172343 Segment 3
Event B 17MAR18 182253 17MAR18 185623 Segment 3
Event B 17MAR18 195533 17MAR18 202903 Segment 3
Event A 17MAR18 203738 17MAR18 204028 Segment 3
Event B 18MAR18 212813 18MAR18 220143 Segment 4
Event A 18MAR18 221058 18MAR18 222338 Segment 4
Event B 18MAR18 230103 18MAR18 233423 Segment 4
Event A 19MAR18 234728 19MAR18 000048 Segment 5
Event B 20MAR18 003343 20MAR18 010703 Segment 5
Event A 20MAR18 012508 20MAR18 013418 Segment 5
Event B 21MAR18 020623 21MAR18 023943 Segment 6
Event B 21MAR18 033903 21MAR18 041223 Segment 6
Event B 21MAR18 051143 21MAR18 054503 Segment 6
Event B 21MAR18 064433 21MAR18 071743 Segment 6
Event A 22MAR18 074058 22MAR18 075008 Segment 7
Event B 22MAR18 081713 22MAR18 085023 Segment 7
Event A 23MAR18 091438 23MAR18 092738 Segment 8
Event B 23MAR18 094953 23MAR18 102303 Segment 8
Event A 23MAR18 105148 23MAR18 110418 Segment 8
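The question also asked whether the segments could be collected in a dictionary. The following is only a sketch of one way to do that with the same windowing rule as the answer above. It assumes the fields are separated by whitespace and that the start date and time are the third and fourth fields, so it reads them with split() instead of the fixed slice line[14:28]. The names parse_start and group_by_24h are made up for this illustration.

```python
from datetime import datetime, timedelta

DAY = timedelta(hours=24)


def parse_start(line):
    """Parse the start timestamp from the 3rd and 4th whitespace-separated fields."""
    fields = line.split()
    return datetime.strptime(fields[2] + ' ' + fields[3], '%d%b%y %H%M%S')


def group_by_24h(lines):
    """Return {segment number: [lines]}; a window spans 24 hours from its first line."""
    segments = {}
    segment = 0
    window_start = None
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        start = parse_start(line)
        # Open a new segment for the first line, or when the timestamp
        # falls outside the current 24-hour window.
        if window_start is None or not (window_start <= start < window_start + DAY):
            segment += 1
            window_start = start
        segments.setdefault(segment, []).append(line.strip())
    return segments


# A few lines taken from the sample data in the question.
sample = [
    'Event A 15MAR18 103000 15MAR18 103758',
    'Event A 16MAR18 120518 16MAR18 121308',
    'Event B 16MAR18 121203 16MAR18 124543',
    'Event B 17MAR18 165013 17MAR18 172343',
]

for number, members in group_by_24h(sample).items():
    for member in members:
        print('{}\tSegment {}'.format(member, number))
```

For a real file, the list returned by readlines() can be passed to group_by_24h in place of sample. Parsing month abbreviations such as MAR with %b depends on the current locale, so this assumes an English or C locale.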
This post, "python - Splitting txt file data into 24-hour blocks based on timestamps", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/49452063/