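I receive a response from a semi-untrusted API that is supposed to contain HTML. I want to convert it to plain text, basically stripping all formatting, so that I can easily search it and then display (parts of) it.
I came up with this: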
function convertHtmlToText(html) {
  const div = document.createElement("div");
  // assumption: because the div is not part of the document
  // - no scripts are executed
  // - no layout pass
  div.innerHTML = html;
  // assumption: whitespace is still normalized
  // assumption: this returns the text a user would see, if the element was inserted into the DOM.
  // Minus the stuff that would depend on stylesheets anyway.
  return div.innerText;
}
const html = `
Some random untrusted string that is supposed to contain html.
Presumably some 'rich text'.
A few <div> or <p>, a link or two, a bit of <strong> and some such.
In any case not a complete html document.
`;
const text = convertHtmlToText(html);
const p = document.createElement("p");
p.textContent = text;
document.body.append(p);
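I assume this is safe because scripts are not executed as long as the div used for the conversion is not inserted into the document.
Question: Is this safe?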
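Best answer
No, this is not safe at all. <script> elements set through innerHTML are indeed not run, but inline event handlers such as onerror still fire, even on an element that is not attached to the document: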
function convertHtmlToText(html) {
  const div = document.createElement("div");
  // assumption: because the div is not part of the document
  // - no scripts are executed
  // - no layout pass
  div.innerHTML = html;
  // assumption: whitespace is still normalized
  // assumption: this returns the text a user would see, if the element was inserted into the DOM.
  // Minus the stuff that would depend on stylesheets anyway.
  return div.innerText;
}
const html = `<img onerror="alert('Gotcha!')" src="">Hi`;
const text = convertHtmlToText(html);
const p = document.createElement("p");
p.textContent = text;
document.body.append(p);
If you really only need the text content, prefer a DOMParser, which will not execute any scripts:
function convertHtmlToText(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.body.innerText;
}
const html = `<img onerror="alert('Gotcha!')" src="">Hi`;
const text = convertHtmlToText(html);
const p = document.createElement("p");
p.textContent = text;
document.body.append(p);
Note, however, that these methods will also capture the text content of nodes that the user would normally not see (e.g. <style> or <script>).
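One possible way to deal with that caveat is to drop the non-visible elements from the parsed document before reading its text. The following is only a sketch under a few assumptions: the function name convertHtmlToVisibleText, the HIDDEN_TAGS selector list and the optional parser parameter are hypothetical and not part of the original answer. It also collapses whitespace by hand, because innerText on a document that is not being rendered returns the same value as textContent, so whitespace is not normalized automatically.

```javascript
// Hypothetical list of elements whose text is never shown to the user.
const HIDDEN_TAGS = "script, style, noscript";

// Sketch: parse the HTML in an inert document, remove hidden elements,
// then return the remaining text with whitespace collapsed.
// The `parser` parameter is an assumption added so a parser can be injected;
// in a browser the default DOMParser is used.
function convertHtmlToVisibleText(html, parser = new DOMParser()) {
  // Documents created by DOMParser do not run scripts or event handlers.
  const doc = parser.parseFromString(html, "text/html");

  // Remove nodes whose content would otherwise leak into the text.
  doc.querySelectorAll(HIDDEN_TAGS).forEach((el) => el.remove());

  // textContent keeps the source whitespace, so normalize it manually.
  return doc.body.textContent.replace(/\s+/g, " ").trim();
}
```

In a browser this could be called as convertHtmlToVisibleText(html) and the result assigned to textContent, as in the snippets above. It still does not account for elements hidden through attributes or stylesheets, so treat it as a starting point and not a complete solution.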
Regarding "javascript - Is this a safe way to convert html to text", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63824980/