python - How to compare two HTML files in Python and print only the differences?

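I have two HTML reports generated by Sonar that show the issues in my code.

Problem statement: I need to compare the two Sonar reports and find the differences, i.e. the newly introduced issues. Basically, I need to find the differences between the HTML files and print only those differences.

Here is what I tried: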

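import difflib

# Read both reports as lists of lines, closing the files afterwards
with open('sonarlint-report.html', 'r') as f1:
    file1 = f1.readlines()
with open('sonarlint-report_latest.html', 'r') as f2:
    file2 = f2.readlines()

# Build a side-by-side HTML diff of the two reports
htmlDiffer = difflib.HtmlDiff()
htmldiffs = htmlDiffer.make_file(file1, file2)

with open('comparison.html', 'w') as outfile:
    outfile.write(htmldiffs)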

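This gives me a comparison.html, which is just a full side-by-side diff of the two HTML files. It does not print only the lines that differ.

Should I parse the HTML and then somehow extract the differences and print only those? Please advise.

Best answer

If you use difflib.Differ, you can keep only the lines that differ by filtering on the two-letter code written at the start of each line. From the docs: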

class difflib.Differ

This is a class for comparing sequences of lines of text, and producing human-readable differences or deltas. Differ uses SequenceMatcher both to compare sequences of lines, and to compare sequences of characters within similar (near-matching) lines.

Each line of a Differ delta begins with a two-letter code:

Code    Meaning
'- '    line unique to sequence 1
'+ '    line unique to sequence 2
'  '    line common to both sequences
'? '    line not present in either input sequence

Lines beginning with '?' attempt to guide the eye to intraline differences, and were not present in either input sequence. These lines can be confusing if the sequences contain tab characters.

Keeping only the lines that start with '- ' and '+ ' leaves you with just the differences.
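The filtering described above could be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the helper name `changed_lines` and the small in-memory sample reports are hypothetical stand-ins for the two Sonar report files.

```python
import difflib


def changed_lines(old_lines, new_lines):
    """Return only the delta lines that Differ marks as removed ('- ') or added ('+ ').

    `changed_lines` is a hypothetical helper name used for illustration.
    """
    delta = difflib.Differ().compare(old_lines, new_lines)
    # Drop the common lines ('  ') and the intraline hint lines ('? ').
    return [line for line in delta if line.startswith(('- ', '+ '))]


# Hypothetical sample content standing in for the two reports.
old_report = ['<li>Issue A</li>\n', '<li>Issue B</li>\n']
new_report = ['<li>Issue A</li>\n', '<li>Issue B</li>\n', '<li>Issue C</li>\n']

for line in changed_lines(old_report, new_report):
    print(line, end='')
```

With real files, the two arguments would be the lists returned by `readlines()` on each report. If only newly introduced issues matter, filtering on '+ ' alone should be enough, since those are the lines unique to the second (newer) sequence.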

A similar question to "python - How to compare two HTML files in Python and print only the differences?" can be found on Stack Overflow: https://stackoverflow.com/questions/51928222/
