Why does a non-validating DocumentBuilder try to read the DTD file in the SSCCE below?
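import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class FooMain {
    // The DOCTYPE points at an external DTD that does not exist on disk.
    private static final String XML_INSTANCE = "<?xml version=\"1.0\"?> " +
            "<!DOCTYPE note SYSTEM \"does-not-exist.dtd\"> " +
            "<a/> ";

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false);
        factory.setValidating(false); // non-validating, yet the DTD is still fetched
        DocumentBuilder builder = factory.newDocumentBuilder();
        InputStream is = new ByteArrayInputStream(XML_INSTANCE.getBytes("UTF-8"));
        Document doc = builder.parse(is); // throws FileNotFoundException
    }
}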
The code blows up:
[java] Exception in thread "main" java.io.FileNotFoundException: /lhome/minimal-for-SO/does-not-exist.dtd (No such file or directory)
[java] at java.io.FileInputStream.open(Native Method)
[java] at java.io.FileInputStream.<init>(FileInputStream.java:146)
[java] at java.io.FileInputStream.<init>(FileInputStream.java:101)
[java] at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
[java] at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
[java] at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
[java] at org.apache.xerces.impl.XMLEntityManager.startEntity(Unknown Source)
[java] at org.apache.xerces.impl.XMLEntityManager.startDTDEntity(Unknown Source)
[java] at org.apache.xerces.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
[java] at org.apache.xerces.impl.XMLDocumentScannerImpl$DTDDispatcher.dispatch(Unknown Source)
[java] at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
[java] at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
[java] at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
[java] at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
[java] at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
[java] at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
[java] at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
[java] at FooMain.main(FooMain.java:35)
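Since the builder is non-validating, I would expect it at least not to fail when the file cannot be found (if not to skip the search for the DTD file altogether). So what prevents the document from being parsed, given that the builder is non-validating and therefore should not need access to the DTD?
Best answer
setValidating(false) only turns off validation against the DTD. The parser still loads the external DTD subset, because it can contribute entity declarations and default attribute values. To ignore DTD directives and references, you have to set a few more flags: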
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
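Applied to the SSCCE from the question, the two flags could be wired in as in the sketch below. The class name LenientDtdParsing and the helper parse are made up for this illustration. The feature URIs are Xerces-specific, so this assumes the underlying parser is Xerces or the Xerces-derived parser bundled with the JDK; other implementations may reject them with a ParserConfigurationException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical class name, used only for this sketch.
final class LenientDtdParsing {

    // Parses the given XML without fetching the external DTD named in its DOCTYPE.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false);
        factory.setValidating(false);
        // Xerces-specific features (assumption: the parser in use recognizes them).
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        try (InputStream is = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            return builder.parse(is);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE note SYSTEM \"does-not-exist.dtd\">"
                + "<a/>";
        Document doc = parse(xml);
        // Prints the name of the root element of the parsed document.
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The features have to be set on the factory before newDocumentBuilder() is called; builders that were created earlier keep their old configuration.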
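If you are building a web application, I recommend disabling the resolution of DTD entities globally, because it is a potential security vulnerability (XML external entity injection, XXE).
For example: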
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///dev/random" >]><foo>&xxe;</foo>
Expanding &xxe; makes the parser read from /dev/random, which never reaches end of file, and this can bring the server down.
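One way to follow that recommendation is to refuse DOCTYPE declarations altogether, sketched below. The class name StrictXmlParsing and its methods are hypothetical. The disallow-doctype-decl feature is again Xerces-specific, so this assumes Xerces or the JDK's bundled parser. Note that it also rejects harmless documents such as the one in the question; if DOCTYPEs must be accepted, setting http://xml.org/sax/features/external-general-entities to false is a possible alternative.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this sketch.
final class StrictXmlParsing {

    // Builds a factory whose parsers fail on any DOCTYPE declaration.
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Xerces-specific feature (assumption: the parser in use recognizes it).
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the hardened parser refuses the document.
    static boolean isRejected(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            newHardenedFactory().newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
            return false;
        } catch (SAXException e) {
            // The DOCTYPE is reported as a fatal error before any entity is resolved.
            return true;
        }
    }
}
```

With this configuration the parser stops at the DOCTYPE, so the external entity in the example above is never opened.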
Source: the question "java - non-validating DocumentBuilder trying to read the DTD file" on Stack Overflow: https://stackoverflow.com/questions/24744175/