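I have two HTML reports generated by Sonar (SonarLint) that list the issues in my code.
Problem statement: I need to compare the two Sonar reports and find the differences, i.e. the newly introduced issues. Essentially, I need to find the differences between the two HTML files and print only those.
Here is what I tried: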
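import difflib

# Read both reports as lists of lines, closing the files afterwards
with open('sonarlint-report.html', 'r') as f1:
    file1 = f1.readlines()
with open('sonarlint-report_latest.html', 'r') as f2:
    file2 = f2.readlines()

# Build a side-by-side HTML comparison of the two reports
htmlDiffer = difflib.HtmlDiff()
htmldiffs = htmlDiffer.make_file(file1, file2)
with open('comparison.html', 'w') as outfile:
    outfile.write(htmldiffs)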
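This gives me a comparison.html, but it is just a full side-by-side diff of the two HTML files. It does not print only the lines that differ.
Should I try parsing the HTML and then somehow extract the differences and print only those? Please advise.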
Best answer
If you use difflib.Differ, you can keep only the differing lines by filtering on the two-character code written at the start of each line. From the docs:
class difflib.Differ
This is a class for comparing sequences of lines of text, and producing human-readable differences or deltas. Differ uses SequenceMatcher both to compare sequences of lines, and to compare sequences of characters within similar (near-matching) lines.
Each line of a Differ delta begins with a two-letter code:
Code    Meaning
'- '    line unique to sequence 1
'+ '    line unique to sequence 2
'  '    line common to both sequences
'? '    line not present in either input sequence
Lines beginning with '?' attempt to guide the eye to intraline differences, and were not present in either input sequence. These lines can be confusing if the sequences contain tab characters.
Keeping only the lines that start with '- ' or '+ ' leaves just the differences.
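A minimal sketch of that filtering, assuming the reports are compared line by line. The helper name changed_lines and the sample report lines are hypothetical and stand in for the real files; only difflib.Differ and its compare() method come from the standard library.

```python
import difflib

def changed_lines(old_lines, new_lines):
    """Return only the delta lines that Differ marks as removed ('- ') or added ('+ ')."""
    delta = difflib.Differ().compare(old_lines, new_lines)
    # Drop the common lines ('  ') and the intraline hint lines ('? ')
    return [line for line in delta if line.startswith(('- ', '+ '))]

# Hypothetical stand-ins for the lines of the two reports
old_report = ['<li>Issue A</li>\n', '<li>Issue B</li>\n']
new_report = ['<li>Issue A</li>\n', '<li>Issue B</li>\n', '<li>Issue C</li>\n']

for line in changed_lines(old_report, new_report):
    print(line, end='')
```

For the real reports, the two lists would come from readlines() on sonarlint-report.html and sonarlint-report_latest.html. To see only what the newer report introduced, filtering on '+ ' alone should be enough.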
Regarding "python - How to compare two HTML files in Python and print only the differences?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51928222/