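java - How to collect different parts of a log with a regular expression

Tags: java regex

I need to collect information from a log. Unfortunately, the pieces I need are not next to each other; other entries sit between them.

For example, I want to know who the parent of a child is. The log looks like this: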

[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.
[Mar-27-2019 20:17:32]*** Started adoption of Ninon Caron for Jacqueline Leduc and Don Lothario.
[Mar-27-2019 20:17:32]*** Started adoption of Emile François for Marion Boyer and Paolo Rocca.
[Mar-27-2019 20:17:32]Started 4 pregnancies
[Mar-27-2019 20:17:32]*** Started pet pregnancy for Josie with Bartholomiaou A. Bittlebun Senior.
[Mar-27-2019 20:17:32]*** Started pet pregnancy for Blue with Tempête Romeo.
[Mar-27-2019 20:17:32]Started 2 pet pregnancies
[Mar-27-2019 20:17:32]Checking for random marriage
(...)
[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.
[Mar-28-2019 09:54:54]   Female delivered:
[Mar-28-2019 09:54:54]   * Zélie Landgraab
[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.
[Mar-28-2019 09:54:54]   Female delivered:
[Mar-28-2019 09:54:54]   * Jessica Goth
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.

So the lines I need to collect together are:

[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.
[Mar-28-2019 09:54:54]   Female delivered:
[Mar-28-2019 09:54:54]   * Jessica Goth
[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.

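Is there a simple way to do this in Java?

Best answer

For example, we could design an expression that looks for words likely to appear in the desired lines, such as: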

^(?=.*(?:delivered|\*\*\*\s+Started\s+pregnancy)).*$

Then we collect those lines.
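As a rough illustration of which lines the expression keeps, the sketch below tests each line on its own and collects the ones that match. The class name `KeywordLineFilter`, the method `collect`, and the shortened sample log are assumptions made for this example, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: keeps the lines that contain one of the keywords.
class KeywordLineFilter {
    // Same expression as above. No MULTILINE flag is needed here because
    // every line is matched separately.
    private static final Pattern WANTED = Pattern.compile(
        "^(?=.*(?:delivered|\\*\\*\\*\\s+Started\\s+pregnancy)).*$");

    static List<String> collect(String log) {
        List<String> kept = new ArrayList<>();
        // \R matches any line break sequence (Java 8+).
        for (String line : log.split("\\R")) {
            if (WANTED.matcher(line).matches()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Shortened sample, taken from the log in the question.
        String log =
              "[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.\n"
            + "[Mar-27-2019 20:17:32]*** Started pet pregnancy for Josie with Bartholomiaou A. Bittlebun Senior.\n"
            + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.\n"
            + "[Mar-28-2019 09:54:54]   * Jessica Goth";
        collect(log).forEach(System.out::println);
    }
}
```

With this sample, the "Started pregnancy" and "delivered" lines are kept, while the pet pregnancy line is skipped. Note that the line carrying the child's name (`* Jessica Goth`) contains neither keyword, so the expression does not collect it on its own.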

Demo

The expression is explained in the top-right panel of this demo, in case you want to explore, simplify, or modify it.

Test

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogLineDemo {
    public static void main(String[] args) {
        // Matches any whole line that contains "delivered" or "*** Started pregnancy".
        final String regex = "^(?=.*(?:delivered|\\*\\*\\*\\s+Started\\s+pregnancy)).*$";
        final String string = "[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.\n"
             + "[Mar-27-2019 20:17:32]*** Started adoption of Ninon Caron for Jacqueline Leduc and Don Lothario.\n"
             + "[Mar-27-2019 20:17:32]*** Started adoption of Emile François for Marion Boyer and Paolo Rocca.\n"
             + "[Mar-27-2019 20:17:32]Started 4 pregnancies\n"
             + "[Mar-27-2019 20:17:32]*** Started pet pregnancy for Josie with Bartholomiaou A. Bittlebun Senior.\n"
             + "[Mar-27-2019 20:17:32]*** Started pet pregnancy for Blue with Tempête Romeo.\n"
             + "[Mar-27-2019 20:17:32]Started 2 pet pregnancies\n"
             + "[Mar-27-2019 20:17:32]Checking for random marriage\n"
             + "(...)\n"
             + "[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.\n"
             + "[Mar-28-2019 09:54:54]   Female delivered:\n"
             + "[Mar-28-2019 09:54:54]   * Zélie Landgraab\n"
             + "[Mar-28-2019 09:54:54]Nancy Landgraab delivered 1 baby.\n"
             + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.\n"
             + "[Mar-28-2019 09:54:54]   Female delivered:\n"
             + "[Mar-28-2019 09:54:54]   * Jessica Goth\n"
             + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.";

        // MULTILINE makes ^ and $ match at the start and end of every line.
        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // The expression has no capturing groups, so group(0) is the whole matched line.
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
        }
    }
}
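The question asks who the parents of each child are, which needs a little state on top of line matching. A minimal sketch of one way to do that, assuming the log always has the shape shown in the question: remember the partner from each "Started pregnancy" line, remember the most recent "delivered" parent, and attach each `* Name` line to that parent. The class `BirthLogParser`, the method `parentsByChild`, the three patterns, and the "unknown" placeholder are all hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser: maps each child name to "parent and partner".
class BirthLogParser {
    // Timestamp prefix such as "[Mar-28-2019 09:54:54]".
    private static final String TS = "^\\[[^\\]]*\\]";
    private static final Pattern STARTED = Pattern.compile(
        TS + "\\*{3}\\s+Started\\s+pregnancy\\s+for\\s+(.+?)\\s+with\\s+(.+?)\\.$");
    private static final Pattern DELIVERED = Pattern.compile(
        TS + "(.+?)\\s+delivered\\s+\\d+\\s+bab(?:y|ies)\\.$");
    private static final Pattern CHILD = Pattern.compile(
        TS + "\\s+\\*\\s+(.+)$");

    static Map<String, String> parentsByChild(String log) {
        Map<String, String> partnerOf = new HashMap<>();
        Map<String, String> result = new LinkedHashMap<>();
        String currentParent = null;
        for (String line : log.split("\\R")) {
            Matcher m;
            if ((m = STARTED.matcher(line)).find()) {
                // Remember who the pregnancy was started with.
                partnerOf.put(m.group(1), m.group(2));
            } else if ((m = DELIVERED.matcher(line)).find()) {
                // The child lines that follow belong to this parent.
                currentParent = m.group(1);
            } else if (currentParent != null && (m = CHILD.matcher(line)).find()) {
                // Assumption: "unknown" stands in when no start line was seen.
                String partner = partnerOf.getOrDefault(currentParent, "unknown");
                result.put(m.group(1), currentParent + " and " + partner);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String log =
              "[Mar-27-2019 20:17:32]*** Started pregnancy for Bella Goth with Vladimir Goth.\n"
            + "[Mar-27-2019 20:17:32]Started 4 pregnancies\n"
            + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.\n"
            + "[Mar-28-2019 09:54:54]   Female delivered:\n"
            + "[Mar-28-2019 09:54:54]   * Jessica Goth\n"
            + "[Mar-28-2019 09:54:54]Bella Goth delivered 1 baby.";
        System.out.println(parentsByChild(log));
    }
}
```

For the sample in `main`, the resulting map associates Jessica Goth with Bella Goth and Vladimir Goth. The sketch would need adjusting if names can contain " with " or if the child lines can appear before the first "delivered" line.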

RegEx Circuit

jex.im can be used to visualize regular expressions.

Regarding "java - How to collect different parts of a log with a regular expression", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57078524/
