I'm writing an application in Java using import org.jdom.*;
My XML is valid, but sometimes it contains HTML tags. For example, something like this:
<program-title>Anatomy & Physiology</program-title>
<overview>
<content>
For more info click <a href="page.html">here</a>
<p>Learn more about the human body. Choose from a variety of Physiology (A&P) designed for complementary therapies.&#160; Online studies options are available.</p>
</content>
</overview>
<key-information>
<category>Health & Human Services</category>
So my problem is with the <p> tags inside the overview.content node.
I was hoping this code would work:
Element overview = sds.getChild("overview");
Element content = overview.getChild("content");
System.out.println(content.getText());
But it returns blank.
How do I get everything inside the overview.content node back as text (nested tags and all)?
Thanks
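Best answer
content.getText() returns only the element's immediate text, which is really only useful for leaf elements that contain nothing but text.
The trick is to use org.jdom.output.XMLOutputter (here with the compact format, Format.getCompactFormat()):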
import java.io.StringWriter;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class ContentDump { // class name is arbitrary
    public static void main(String[] args) throws Exception {
        SAXBuilder builder = new SAXBuilder();
        String xmlFileName = "a.xml";
        Document doc = builder.build(xmlFileName);
        Element root = doc.getRootElement();
        Element overview = root.getChild("overview");
        Element content = overview.getChild("content");

        XMLOutputter outp = new XMLOutputter();
        outp.setFormat(Format.getCompactFormat());
        // Alternatives to try:
        //outp.setFormat(Format.getRawFormat());
        //outp.setFormat(Format.getPrettyFormat());
        //outp.getFormat().setTextMode(Format.TextMode.PRESERVE);

        // Serialize all child nodes of <content>, text and nested elements alike
        StringWriter sw = new StringWriter();
        outp.output(content.getContent(), sw);
        System.out.println(sw.toString());
    }
}
Output
For more info click<a href="page.html">here</a><p>Learn more about the human body. Choose from a variety of Physiology (A&P) designed for complementary therapies.&#160; Online studies options are available.</p>
Explore the other formatting options and adapt the code above to your needs. The Javadoc for org.jdom.output.Format describes them:
"Class to encapsulate XMLOutputter format options. Typical users can use the standard format configurations obtained by getRawFormat() (no whitespace changes), getPrettyFormat() (whitespace beautification), and getCompactFormat() (whitespace normalization). "
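For anyone curious why getText() comes back almost empty while outputting getContent() does not, here is a minimal sketch of the same distinction using only the JDK's built-in DOM and Transformer APIs rather than JDOM. The class name NestedContentDemo and the helpers parseRoot, directText and innerXml are made up for illustration. directText approximates what getText() sees (text nodes sitting directly under the element), and innerXml approximates outputting the element's content (every child, tags included).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical demo class: JDK DOM only, no JDOM involved.
class NestedContentDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Concatenates only the text nodes that are direct children of the root.
    // Text inside nested elements such as <a> or <p> is not included.
    public static String directText(String xml) {
        StringBuilder sb = new StringBuilder();
        NodeList children = parseRoot(xml).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Text) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    // Serializes every child node of the root (text and elements) back to markup.
    public static String innerXml(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            NodeList children = parseRoot(xml).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                t.transform(new DOMSource(children.item(i)), new StreamResult(sw));
            }
            return sw.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize content", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<content>For more info click "
                + "<a href=\"page.html\">here</a><p>Learn more.</p></content>";
        System.out.println("direct text: [" + directText(xml) + "]");
        System.out.println("inner XML:   [" + innerXml(xml) + "]");
    }
}
```

With JDOM on the classpath, the XMLOutputter approach above is the more direct route. This sketch is only meant to show that nested elements are separate child nodes and not part of the parent's own text.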
Regarding java - How to get node contents from JDOM, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/7910474/