java.net.MalformedURLException when trying to parse a document with the "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" entity

Tags: java entity html-parsing xerces

I'm using Java 6 and trying to parse a well-formed document that begins with
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

I downloaded the DTD "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" and placed it on my classpath. However, when I try to parse the document with the following code (note the custom entity resolver)...

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(false);
    factory.setExpandEntityReferences(false);
    final DocumentBuilder builder = factory.newDocumentBuilder();
    builder.setEntityResolver(new EntityResolver() {
        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            InputSource inputSource = null;
            try {
                final String resource = systemId.substring(systemId.lastIndexOf("/") + 1);
                final InputStream inputStream = getClass().getClassLoader().getResourceAsStream(resource);
                inputSource = new InputSource(inputStream);
            } catch (Exception e) {
                // No action; just let the null InputSource pass through
            }
            return inputSource;
        }
    });
    final InputSource s = new InputSource(new StringReader(str));
    org.w3c.dom.Document result = builder.parse(s);

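I get the following exception on the "builder.parse(s)" line. How can I parse the document without an exception, or at least find out which URL it is complaining about?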

java.net.MalformedURLException
    at java.net.URL.<init>(URL.java:601)
    at java.net.URL.<init>(URL.java:464)
    at java.net.URL.<init>(URL.java:413)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:650)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1194)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1090)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1003)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
    at com.myco.myproject.util.XmlUtilities.getStringAsDocument(XmlUtilities.java:147)
    at com.myco.myproject.util.NetUtilities.getUrlAsDocument(NetUtilities.java:67)
    at com.myco.myproject.parsers.impl.InstitutoCervantes.parsePage(InstitutoCervantes.java:84)
    at com.myco.myproject.parsers.impl.InstitutoCervantes.getEvents(InstitutoCervantes.java:52)
    at com.myco.myproject.domain.EventFeed.refresh(EventFeed.java:81)
    at com.myco.myproject.domain.EventFeed.getEvents(EventFeed.java:72)
    at com.myco.myproject.parsers.impl.InstitutoCervantesParserTest.testParser(InstitutoCervantesParserTest.java:24)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
    at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
    at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:231)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
    at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:174)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

Best answer

Check what the value of "resource" is, and make sure it points to the correct file on your local machine.
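One plausible reading of the stack trace: the exception is raised in startDTDEntity, so the lookup for the DTD itself probably returned null, and an InputSource wrapping a null stream with no system ID leaves the parser with nothing to build a URL from. The sketch below is one way to check this, not a definitive fix. It prints each lookup and never wraps a null stream. The class name ClasspathEntityResolver, its helper methods, and the empty-entity fallback are assumptions made for illustration and are not part of the original post.

```java
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name and the empty-entity fallback are
// illustrative assumptions, not part of the original question or answer.
class ClasspathEntityResolver implements EntityResolver {

    // Last path segment of the system ID, e.g. "xhtml1-strict.dtd".
    static String resourceName(String systemId) {
        return systemId.substring(systemId.lastIndexOf('/') + 1);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String resource = (systemId == null) ? null : resourceName(systemId);
        InputStream in = (resource == null) ? null
                : ClasspathEntityResolver.class.getClassLoader().getResourceAsStream(resource);

        // Print each lookup so a wrong name or location shows up immediately.
        System.err.println("resolveEntity: systemId=" + systemId
                + ", resource=" + resource + ", found=" + (in != null));

        // Never wrap a null stream: fall back to an empty entity instead.
        InputSource source = (in != null)
                ? new InputSource(in)
                : new InputSource(new StringReader(""));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    // Parses an XML string with the resolver installed.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new ClasspathEntityResolver());
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

If the log shows found=false for xhtml1-strict.dtd, the file is not at the classpath root, which is where ClassLoader.getResourceAsStream looks for a bare file name. The strict DTD also pulls in xhtml-lat1.ent, xhtml-symbol.ent, and xhtml-special.ent, so those files need to be on the classpath as well. With the empty fallback used here, a document that references an entity such as &nbsp; would likely still fail because the entity ends up undeclared.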

Source: Stack Overflow, https://stackoverflow.com/questions/10014849/
