Java regex to match multiple lines

Tags: java regex pattern-matching

Here is a sample of the data the regex should be applied to:

2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Filter -> Map (1/1) (824780055001546646d35df7a64cfe3c) switched from CANCELING to CANCELED.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Try to restart or fail the job  (3064130e1dccead0b037f193d3699c3b) if no longer possible.
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Could not restart the job  (3064130e1dccead0b037f193d3699c3b) because the restart strategy prevented it.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job 3064130e1dccead0b037f193d3699c3b.
2019-05-27 10:49:18,418 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
2019-05-27 10:49:18,419 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Job 3064130e1dccead0b037f193d3699c3b reached globally terminal state FAILED.

Basically, I want to extract the timestamp and the error together with its message.

For example:

TimeStamp               Error
2019-05-27 10:49:18,418 ERROR  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job  (3064130e1dccead0b037f193d3699c3b) switched from state FAILING to FAILED.
java.lang.IllegalArgumentException: json can not be null or empty
    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)
    at com.jayway.jsonpath.JsonPath.compile(JsonPath.java:424)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.validateJsonPath(ControlData.java:194)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:178)
    at com.mypkg.subpkg.ControlData$ConnectedStreams.flatMap1(ControlData.java:171)
    at org.apache.flink.streaming.api.operators.co.CoStreamFlatMap.processElement1(CoStreamFlatMap.java:53)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
    at org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)

The error message here spans multiple lines, so I wrote the following Java pattern:

((?m)\\d{4}-[01]\\d-[0-3]\\d\\s[0-2]\\d((:[0-5]\\d)?){2}[\\s\\S]*ERROR[\\s\\S]*[ ]*at [\\s\\S]*)

But it returns the entire contents of my file.

What should I change so that it works and also returns the multi-line error message?

Best answer

Try this:

((\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5})\sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}))

Explanation:

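  • (\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}) - matches the timestamp
  • \sERROR.+?(?=\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3,5}) - matches non-greedily until the next timestamp is found (positive lookahead)
  • Also, I want to stress that the match has to cross line breaks. In Java that means letting . match line terminators with Pattern.DOTALL (or the inline (?s) flag); Pattern.MULTILINE (the m flag) only changes where ^ and $ match
  • Each match gives you nested groups: group 1 is the whole log entry and group 2 is its timestamp, e.g. [[log, timestamp], [log, timestamp]]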
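
To make the explanation concrete, here is a minimal Java sketch of the approach, not a definitive implementation. The asker's pattern returns the whole file because its `[\s\S]*` parts are greedy and run to the last possible position; the lazy `.+?` with a lookahead stops at the next timestamp instead. Several things below are assumptions that go beyond the answer's regex: the class name `ErrorLogExtractor` and the method `extractErrors` are hypothetical, the `^` anchors and the `|\z` alternative (so that an ERROR entry at the very end of the input is still captured) are additions, the groups are rearranged so that group 1 is the timestamp and group 2 is the error text, and the sample log lines are abbreviated. Backslashes are doubled because the pattern is written as a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ErrorLogExtractor {

    // Timestamp such as "2019-05-27 10:49:18,418".
    private static final String TS =
            "\\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3,5}";

    // MULTILINE lets ^ match at the start of every line;
    // DOTALL lets '.' match line terminators so the stack trace is included.
    // Assumption: "^" and "|\\z" are not in the original answer's regex.
    private static final Pattern ERROR_ENTRY = Pattern.compile(
            "^(" + TS + ")\\s+(ERROR.+?)(?=^" + TS + "|\\z)",
            Pattern.MULTILINE | Pattern.DOTALL);

    /** Returns one {timestamp, error text} pair per ERROR entry in the log. */
    public static List<String[]> extractErrors(String log) {
        List<String[]> result = new ArrayList<>();
        Matcher m = ERROR_ENTRY.matcher(log);
        while (m.find()) {
            // group(1) = timestamp, group(2) = "ERROR ..." plus any following lines
            result.add(new String[] { m.group(1), m.group(2).trim() });
        }
        return result;
    }

    public static void main(String[] args) {
        // Abbreviated sample in the same shape as the log in the question.
        String log = String.join("\n",
                "2019-05-27 10:49:18,418 INFO  ExecutionGraph - Try to restart or fail the job.",
                "2019-05-27 10:49:18,418 ERROR  ExecutionGraph - Job switched from state FAILING to FAILED.",
                "java.lang.IllegalArgumentException: json can not be null or empty",
                "    at com.jayway.jsonpath.internal.Utils.notEmpty(Utils.java:256)",
                "    at java.lang.Thread.run(Thread.java:748)",
                "2019-05-27 10:49:18,419 INFO  StandaloneDispatcher - Job reached globally terminal state FAILED.");

        for (String[] entry : extractErrors(log)) {
            System.out.println("TimeStamp: " + entry[0]);
            System.out.println("Error:");
            System.out.println(entry[1]);
        }
    }
}
```

Anchoring the lookahead with `^` means a timestamp-like string in the middle of a message does not cut an entry short, and requiring ERROR directly after the timestamp keeps INFO lines that merely mention the word from matching.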

This Q&A on matching multiple lines with a Java regex is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/56388741/
