python - Extract the last paragraph from text using Python

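I am analyzing service desk tickets and need to extract the first timestamp from the comments field. In other words, I need to know the date and time at which a service desk analyst first interacted with the ticket. I used the datefinder.find_dates() function, and it works fairly well, but some of my ticket comments are very technical and contain a lot of numbers and IP addresses. This seems to confuse datefinder.find_dates(), and it often just returns irrelevant data. I searched for tutorials on the function, but none were useful, since it does not seem to be widely used. I also found this and this SO question, but they did not solve my problem. Because datefinder.find_dates() does not work reliably when the text contains a lot of numeric data, the only other option is to extract the timestamp from the last paragraph of each observation. The timestamp is always at the beginning of the last paragraph, but I have not managed to do this myself, hence this question.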

Here is a snippet showing how most of the data is laid out:

2019-04-10 12:43:54 - Andras Eger (Work notes)
Sim life cycle attached

2019-04-09 17:25:38 - Timea Magyar (Additional comments)
Thank you for contacting us.
We confirm that we have received your email and we are processing the 
case.
As soon as we get any update from the resolver team, we will inform you.

2019-04-09 17:25:25 - Timea Magyar (Work notes)
VTIS: INC000033296089

2019-04-09 17:22:10 - Timea Magyar (Work notes)
This New Incident was raised on behalf of Daniel Orejuela from [code]<a 
href='new_call.do?sys_id=0b580c90dbf837404cd858a5dc961989&
sysparm_stack=new_call_list.do?sysparm_query=active=true'>CALL0109649</a>
[/code][code]<br><p><span>Call Notes

So the main question is: how do I extract the date and time from the last paragraph of each observation? In this case, the output should be:

2019-04-09 17:22:10

Best answer

First split your input on \n\n, take the last item in the resulting list, then apply a regular expression to it.

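import re

text = "..."

# Paragraphs are separated by blank lines; strip() keeps trailing newlines
# from producing an empty final paragraph.
last_paragraph = text.strip().split("\n\n")[-1]

# Match a timestamp of the form YYYY-MM-DD HH:MM:SS and take the first hit.
result = re.findall(r"[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}", last_paragraph)[0]

print(result)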

Result:

2019-04-09 17:22:10
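
If the same extraction has to run over many tickets, it may help to wrap it in a function that tolerates messy input. The following is only a sketch under some assumptions: the function name first_interaction and the sample string are hypothetical, paragraphs are assumed to be separated by blank lines (possibly containing stray spaces or \r\n line endings), and a comment whose last paragraph does not open with a timestamp yields None instead of raising an IndexError. The result is parsed into a datetime so it can be compared or sorted later.

```python
import re
from datetime import datetime

# Timestamp of the form YYYY-MM-DD HH:MM:SS, as seen in the ticket comments.
TIMESTAMP_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}")


def first_interaction(comment):
    """Return the timestamp that opens the last paragraph, or None."""
    # Split on blank lines, tolerating stray spaces and Windows line endings.
    paragraphs = re.split(r"\r?\n[ \t]*\r?\n", comment.strip())
    # match() only succeeds at the start of the paragraph, so numbers
    # further down in the text cannot be picked up by accident.
    match = TIMESTAMP_RE.match(paragraphs[-1].lstrip())
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")


# Hypothetical sample modelled on the layout shown in the question.
sample = (
    "2019-04-09 17:25:25 - Timea Magyar (Work notes)\n"
    "VTIS: INC000033296089\n"
    "\n"
    "2019-04-09 17:22:10 - Timea Magyar (Work notes)\n"
    "This New Incident was raised on behalf of Daniel Orejuela\n"
    "\n"
)

print(first_interaction(sample))
```

Note that a last paragraph which itself contains blank lines (for example inside embedded HTML) would still be split in the wrong place, so the None result is worth checking for when applying this to every observation.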

A similar question about python - Extract the last paragraph from text using Python can be found on Stack Overflow: https://stackoverflow.com/questions/56423982/
