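This may have been asked before, but I couldn't find a solution, so I'm posting the question.
I need to parse the HTML string below to extract the ID, time, and subject of each item: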
<div class="list" id="1">
<div class="time">12:01 PM</div>
<div class="subject">[This is dummy Subject1] This is some dummy strings after subject</div>
<div/>
<div class="list" id="2">
<div class="time">12:01 PM</div>
<div class="subject">[This is dummy Subject2] This is some dummy strings after subject</div>
<div/>
<div class="list" id="3">
<div class="time">12:01 PM</div>
<div class="subject">[This is dummy Subject3] This is some dummy strings after subject</div>
<div/>
The output should look like: id|time|subject
Best answer
See the demo here: https://regex101.com/r/fN1fZ0/1
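// str is assumed to hold the HTML string from the question.
// g: replace every match; s: let . match newlines, so one match can span a whole item.
var re = /.*?id="(.*?)".*?time">(.*?)<\/.*?subject">\[(.*?)\].*?|.*$/gs;
var subst = '$1|$2|$3\n';
var result = str.replace(re, subst);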
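Since the question asks about Java, here is a minimal sketch of the same idea using java.util.regex. The class name ListParser and the method parse are hypothetical, chosen only for illustration. It reuses the three capture groups from the regex above but drops the trailing |.*$ alternative: that alternative only exists to mop up the leftover text in the replace-based version, and it is not needed when iterating with Matcher.find(), which simply stops once no further item matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class ListParser {
    // Group 1: id attribute, group 2: time text, group 3: text inside [...].
    // DOTALL lets '.' cross line breaks, like the 's' flag in the regex above.
    private static final Pattern ITEM = Pattern.compile(
            "id=\"(.*?)\".*?time\">(.*?)</.*?subject\">\\[(.*?)\\]",
            Pattern.DOTALL);

    // Returns one "id|time|subject" row per item found in the input.
    static List<String> parse(String html) {
        List<String> rows = new ArrayList<>();
        Matcher m = ITEM.matcher(html);
        while (m.find()) {
            rows.add(m.group(1) + "|" + m.group(2) + "|" + m.group(3));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two items in the same shape as the question's sample input.
        String html = String.join("\n",
                "<div class=\"list\" id=\"1\">",
                "<div class=\"time\">12:01 PM</div>",
                "<div class=\"subject\">[This is dummy Subject1] This is some dummy strings after subject</div>",
                "<div/>",
                "<div class=\"list\" id=\"2\">",
                "<div class=\"time\">12:01 PM</div>",
                "<div class=\"subject\">[This is dummy Subject2] This is some dummy strings after subject</div>",
                "<div/>");
        for (String row : parse(html)) {
            System.out.println(row);
        }
    }
}
```

Collecting the rows in a list instead of building one replaced string keeps the text between items out of the result, and the caller can join the rows with any separator it likes.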
About java - matching a repeating HTML pattern with a Java regular expression: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29231105/