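regex - Will this set of regular expressions completely protect against cross-site scripting?

What are some examples of dangerous input that the following code will not catch?

EDIT: After some comments, I added one more line, marked with a comment below. See Vinko's comment on David Grant's answer. So far only Vinko has answered the question as asked, which is for specific examples that would slip past this function. Vinko provided one, but I have edited the code to close that hole. If anyone else can think of another specific example, you'll get my vote!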

public static string strip_dangerous_tags(string text_with_tags)
{
    string s = Regex.Replace(text_with_tags, @"<script", "<scrSAFEipt", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"</script", "</scrSAFEipt", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"<object", "<objSAFEct", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"</object", "</obSAFEct", RegexOptions.IgnoreCase);
    // ADDED AFTER THIS QUESTION WAS POSTED
    s = Regex.Replace(s, @"javascript", "javaSAFEscript", RegexOptions.IgnoreCase);

    s = Regex.Replace(s, @"onabort", "onSAFEabort", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onblur", "onSAFEblur", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onchange", "onSAFEchange", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onclick", "onSAFEclick", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"ondblclick", "onSAFEdblclick", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onerror", "onSAFEerror", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onfocus", "onSAFEfocus", RegexOptions.IgnoreCase);

    s = Regex.Replace(s, @"onkeydown", "onSAFEkeydown", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onkeypress", "onSAFEkeypress", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onkeyup", "onSAFEkeyup", RegexOptions.IgnoreCase);

    s = Regex.Replace(s, @"onload", "onSAFEload", RegexOptions.IgnoreCase);

    s = Regex.Replace(s, @"onmousedown", "onSAFEmousedown", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onmousemove", "onSAFEmousemove", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onmouseout", "onSAFEmouseout", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onmouseup", "onSAFEmouseup", RegexOptions.IgnoreCase);

    s = Regex.Replace(s, @"onreset", "onSAFEreset", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onresize", "onSAFEresize", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onselect", "onSAFEselect", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onsubmit", "onSAFEsubmit", RegexOptions.IgnoreCase);
    s = Regex.Replace(s, @"onunload", "onSAFEunload", RegexOptions.IgnoreCase);

    return s;
}

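Best answer

This is never enough: whitelist, don't blacklist.

For example, the javascript: pseudo-URL can be obfuscated with HTML entities, you have forgotten about <embed>, and there are dangerous CSS properties such as behavior and expression in IE.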
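
To make the first two points concrete, here is a minimal sketch that assumes a cut-down version of the question's filter (the names `BlacklistDemo` and `Strip` are hypothetical, and only three of the original rules are reproduced). Both inputs pass through unchanged, because the regexes never see the literal strings they look for.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BlacklistDemo
{
    // Cut-down copy of the question's blacklist (assumption: three rules
    // are enough to illustrate the problem).
    public static string Strip(string text)
    {
        string s = Regex.Replace(text, @"<script", "<scrSAFEipt", RegexOptions.IgnoreCase);
        s = Regex.Replace(s, @"javascript", "javaSAFEscript", RegexOptions.IgnoreCase);
        s = Regex.Replace(s, @"onclick", "onSAFEclick", RegexOptions.IgnoreCase);
        return s;
    }

    public static void Main()
    {
        // "&#115;" is the HTML character reference for 's'. A browser decodes
        // it inside the attribute value, so the link still runs script, but
        // the filter never matches the word "javascript".
        string link = "<a href=\"java&#115;cript:alert(1)\">click</a>";
        Console.WriteLine(Strip(link) == link);   // True: nothing was changed

        // <embed> is simply not on the blacklist.
        string embed = "<embed src=\"http://evil.example/x.swf\">";
        Console.WriteLine(Strip(embed) == embed); // True: nothing was changed
    }
}
```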

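There are countless ways to evade filters, and this approach is bound to fail. Even if you find and block every exploit that is possible today, new unsafe elements and attributes may be added in the future.

There are only two good ways to make HTML safe: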

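1. Convert it to text by replacing every < with &lt;. If you want to let users enter formatted text, you can use your own markup (e.g. Markdown, as SO does).
2. Parse the HTML into a DOM, check every element and attribute, and remove everything that is not whitelisted. You also need to check the contents of allowed attributes such as href (make sure URLs use a safe protocol, and block all unknown protocols). Once the DOM is cleaned up, generate new, valid HTML from it. Never process HTML as if it were text, because invalid markup, comments, entities, etc. can easily fool your filter.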
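
The two approaches can be sketched roughly as follows. This is an illustration under stated assumptions, not a complete sanitizer: `SafeHtml`, `AsText` and `IsAllowedHref` are hypothetical names, the scheme allowlist is only an example, and the DOM parsing step itself (which needs a real HTML parser) is left out. `AsText` relies on `WebUtility.HtmlEncode` from `System.Net`; `IsAllowedHref` is the kind of check you would run on each `href` value read from the parsed DOM.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

public static class SafeHtml
{
    // Approach 1: treat the whole input as plain text.
    // HtmlEncode escapes <, >, & and quotes, so no tag can survive.
    public static string AsText(string userInput)
    {
        return WebUtility.HtmlEncode(userInput);
    }

    // Example allowlist (an assumption; adjust it to your application).
    private static readonly HashSet<string> AllowedSchemes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "http", "https", "mailto" };

    // Part of approach 2: validate an attribute value taken from the DOM,
    // i.e. after the parser has already decoded any entities.
    public static bool IsAllowedHref(string href)
    {
        Uri uri;
        if (!Uri.TryCreate(href, UriKind.Absolute, out uri))
            return false; // relative or unparsable URLs are rejected in this sketch

        // Anything not explicitly allowed (javascript:, data:, file:, ...) is blocked.
        return AllowedSchemes.Contains(uri.Scheme);
    }
}
```

The point of the design is that unknown input is rejected by default: a new or oddly spelled scheme fails the check without anyone having to anticipate it, which is the opposite of how the blacklist behaves.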

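Also make sure your page declares its encoding, because there are exploits that take advantage of browsers auto-detecting the wrong encoding.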
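
One way to declare the encoding, sketched under the assumption that the page is served as UTF-8: send a charset in the Content-Type header and repeat it in a meta tag near the top of the document. The `PageEncoding` class and its members are hypothetical names, and how the header is actually attached to the response depends on your web framework.

```csharp
using System.Text;

public static class PageEncoding
{
    // Value to send as the Content-Type response header.
    public const string ContentTypeHeader = "text/html; charset=utf-8";

    // Wraps already-sanitized body HTML in a page that declares its encoding.
    public static string WrapPage(string safeBodyHtml)
    {
        var sb = new StringBuilder();
        sb.Append("<!DOCTYPE html>\n");
        sb.Append("<html><head><meta charset=\"utf-8\"><title>Page</title></head>\n");
        sb.Append("<body>").Append(safeBodyHtml).Append("</body></html>");
        return sb.ToString();
    }

    // The bytes written to the response must match the declared encoding.
    public static byte[] ToBytes(string page)
    {
        return Encoding.UTF8.GetBytes(page);
    }
}
```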

Regarding "regex - Will this set of regular expressions completely protect against cross-site scripting?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/195648/
