When I use Jsoup.clean(), jsoup transforms the title attribute of an HTML link from:
<a href="" title="test &lt;br /&gt;">TEST</a>
into:
<a href="" title="test <br />">TEST</a>
Here is a demo application:
import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist;

// Allow only <a> elements with their href and title attributes.
Whitelist whitelist = new Whitelist();
whitelist.addTags("a");
whitelist.addAttributes("a", "href", "title");
// The title value contains an escaped <br /> tag.
String input = "<a href=\"\" title=\"test &lt;br /&gt;\">TEST</a>";
System.out.println("input: " + input);
String output = Jsoup.clean(input, whitelist);
System.out.println("output: " + output);
It prints:
input: <a href="" title="test &lt;br /&gt;">TEST</a>
output: <a href="" title="test <br />">TEST</a>
I tried adding OutputSettings with an EscapeMode:
OutputSettings outputSettings = new OutputSettings();
outputSettings.escapeMode(EscapeMode.xhtml);
EscapeMode.base and EscapeMode.extended had no effect. EscapeMode.xhtml prints the following:
input: <a href="" title="test &lt;br /&gt;">TEST</a>
output: <a href="" title="test &lt;br />">TEST</a>
Any idea how to keep jsoup from manipulating the title attribute?
Best answer
This is a known issue/behavior: https://github.com/jhy/jsoup/issues/684 (marked "won't fix" by the jsoup team).
There's not a bug here.
When serializing (i.e. in your example when you're printing out XML/HTML), we escape as few characters as necessary. That is why the > is not escaped to &gt;; because it's in a quoted attribute, there's no ambiguity that it's closing a tag, so it doesn't get escaped.
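To make the "escape as few characters as necessary" rule concrete, here is a minimal sketch in plain Java. It is not jsoup's actual implementation; the class AttrEscapeSketch and the helper escapeAttr are hypothetical names invented for illustration. It assumes a double-quoted HTML attribute value, where only & and " need escaping, so < and > pass through unchanged.

```java
class AttrEscapeSketch {

    // Hypothetical helper, NOT jsoup's code: escapes only the characters that
    // would be ambiguous inside a double-quoted HTML attribute value.
    static String escapeAttr(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");   // would otherwise start an entity
                    break;
                case '"':
                    out.append("&quot;");  // would otherwise end the attribute
                    break;
                default:
                    out.append(c);         // '<' and '>' are unambiguous here
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The parser has already decoded &lt; and &gt; into literal characters.
        String decodedTitle = "test <br />";
        System.out.println("<a href=\"\" title=\"" + escapeAttr(decodedTitle) + "\">TEST</a>");
    }
}
```

Under these assumptions the decoded title is written back without entities, which mirrors the output in the question. An HTML parser reading either form should recover the same attribute value, which is why the serialized text changes while the attribute's meaning does not.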
Regarding "java - How to preserve link title attribute using jsoup?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43279787/