java - XML parsing with dynamic structure and content

Tags: java xml domparser

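What I want to achieve:

Get the elements of an unknown XML file (the element names, and how many elements the file contains).

Then get all attributes with their names and values for later use (for example, to compare against other XML files).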


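Research: 1. 2. 3. 4. 5. and more

Does anyone have any ideas on this?

I do not want to predefine more than 500 tables the way the code snippet below does; somehow I should be able to get the number of elements and the element names themselves dynamically.

EDIT!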

Example1
<Root Attri1="" Attri2="">
    <element1 EAttri1="" EAttri2=""/>
    <Element2 EAttri1="" EAttri2="">
        <nestedelement3 NEAttri1="" NEAttri2=""/>
    </Element2> 
</Root>

Example2
<Root Attri1="" Attri2="" Attr="" At="">
    <element1 EAttri1="" EAttri2="">
        <nestedElement2 EAttri1="" EAttri2="">
            <nestedelement3 NEAttri1="" NEAttri2=""/>
        </nestedElement2>
    </element1> 
</Root>

Program snippet:

String Example1[] = {"element1","Element2","nestedelement3"};
String Example2[] = {"element1","nestedElement2","nestedelement3"};


for(int i=0;i<Example1.length;i++){
    NodeList Elements = oldDOC.getElementsByTagName(Example1[i]);
    for(int j=0;j<Elements.getLength();j++) {
        Node nodeinfo=Elements.item(j);
        for(int l=0;l<nodeinfo.getAttributes().getLength();l++) {
            .....
        }
    }
}

Output: the expected result is to get all elements and all attributes from the XML file without predefining anything.

For example:

Elements: element1 Element2 nestedelement3

Attributes:  Attri1 Attri2 EAttri1 EAttri2 EAttri1 EAttri2 NEAttri1 NEAttri2

Best answer

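The right tool for this job is XPath. It lets you collect all, or a subset, of the elements and attributes based on various criteria. It is the closest thing to a "generic" XML parser.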
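To make "various criteria" concrete, here is a minimal sketch that evaluates a few XPath expressions against the question's Example1 document. The class name `XPathCriteriaSketch` and the helper `namesMatching` are hypothetical names introduced for illustration only; the XML is parsed from an in-memory string rather than a file to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class, not part of the original answer.
public class XPathCriteriaSketch {

    // Evaluates an XPath expression against an XML string and returns
    // the node names of every matched element or attribute.
    public static List<String> namesMatching(String xml, String expression) {
        XPath xPath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xPath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Example1 from the question, inlined as a string
        String xml = "<Root Attri1=\"\" Attri2=\"\">"
            + "<element1 EAttri1=\"\" EAttri2=\"\"/>"
            + "<Element2 EAttri1=\"\" EAttri2=\"\">"
            + "<nestedelement3 NEAttri1=\"\" NEAttri2=\"\"/>"
            + "</Element2></Root>";

        System.out.println(namesMatching(xml, "//*"));           // every element, at any depth
        System.out.println(namesMatching(xml, "//@*"));          // every attribute, at any depth
        System.out.println(namesMatching(xml, "//*[@EAttri1]")); // only elements that carry EAttri1
        System.out.println(namesMatching(xml, "/Root/*"));       // only direct children of Root
    }
}
```

None of the expressions name a specific element up front, which is what makes this approach workable for documents whose structure is not known in advance.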

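Here is the solution I came up with. It first finds all element names in the given XML document, then counts the occurrences of each element and collects everything into a map. It then does the same for attributes.
I added inline comments, and the method and variable names should be self-explanatory.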

import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.function.*;
import java.util.stream.*;

import org.w3c.dom.*;

import javax.xml.parsers.*;
import javax.xml.xpath.*;

public class TestXpath
{

    public static void main(String[] args) {

        XPath xPath = XPathFactory.newInstance().newXPath();

        try (InputStream is = Files.newInputStream(Paths.get("C://temp/test.xml"))) {
            // parse file into xml doc
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document xmlDocument = builder.parse(is);

            // find all element names in xml doc
            Set<String> allElementNames = findNames(xmlDocument, xPath.compile("//*[name()]"));
            // for each name, count occurrences, and collect to map
            Map<String, Integer> elementsAndOccurrences = allElementNames.stream()
                .collect(Collectors.toMap(Function.identity(), name -> countElementOccurrences(xmlDocument, name)));
            System.out.println(elementsAndOccurrences);

            // find all attribute names in xml doc
            Set<String> allAttributeNames = findNames(xmlDocument, xPath.compile("//@*"));
            // for each name, count occurrences, and collect to map
            Map<String, Integer> attributesAndOccurrences = allAttributeNames.stream()
                .collect(Collectors.toMap(Function.identity(), name -> countAttributeOccurrences(xmlDocument, name)));
            System.out.println(attributesAndOccurrences);

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static Set<String> findNames(Document xmlDoc, XPathExpression xpathExpr) {
        try {
            NodeList nodeList = (NodeList)xpathExpr.evaluate(xmlDoc, XPathConstants.NODESET);
            // convert nodeList to set of node names
            return IntStream.range(0, nodeList.getLength())
                .mapToObj(i -> nodeList.item(i).getNodeName())
                .collect(Collectors.toSet());
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
        return new HashSet<>();
    }

    public static int countElementOccurrences(Document xmlDoc, String elementName) {
        return countOccurrences(xmlDoc, elementName, "count(//*[name()='" + elementName + "'])");
    }

    public static int countAttributeOccurrences(Document xmlDoc, String attributeName) {
        return countOccurrences(xmlDoc, attributeName, "count(//@*[name()='" + attributeName + "'])");
    }

    public static int countOccurrences(Document xmlDoc, String name, String xpathExpr) {
        XPath xPath = XPathFactory.newInstance().newXPath();
        try {
            Number count = (Number)xPath.compile(xpathExpr).evaluate(xmlDoc, XPathConstants.NUMBER);
            return count.intValue();
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
        return 0;
    }
}
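The code above counts names but does not keep attribute values, which the question also asks for (to compare against other XML files later). One possible way to capture them, sketched here with plain DOM calls instead of XPath, is to walk every element via `getElementsByTagName("*")` and store each occurrence's attributes in a map. The class name `AttributeValuesSketch`, the method `collect`, and the sample attribute values are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration class, not part of the original answer.
public class AttributeValuesSketch {

    // Maps each element name to a list with one entry per occurrence;
    // each entry maps attribute name to attribute value.
    public static Map<String, List<Map<String, String>>> collect(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, List<Map<String, String>>> result = new LinkedHashMap<>();
            // "*" matches every element in the document, including the root
            NodeList elements = doc.getElementsByTagName("*");
            for (int i = 0; i < elements.getLength(); i++) {
                Node element = elements.item(i);
                NamedNodeMap attrs = element.getAttributes();
                // TreeMap keeps attribute names sorted, so printing is deterministic
                Map<String, String> values = new TreeMap<>();
                for (int j = 0; j < attrs.getLength(); j++) {
                    values.put(attrs.item(j).getNodeName(), attrs.item(j).getNodeValue());
                }
                result.computeIfAbsent(element.getNodeName(), k -> new ArrayList<>()).add(values);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Two made-up documents that differ in a single attribute value
        String oldXml = "<Root Attri1=\"a\"><element1 EAttri1=\"x\"/></Root>";
        String newXml = "<Root Attri1=\"a\"><element1 EAttri1=\"y\"/></Root>";

        System.out.println(collect(oldXml));
        System.out.println(collect(oldXml).equals(collect(newXml)));
    }
}
```

Because the result is built from ordinary maps and lists, two documents can be compared with `equals`, as in `main`. This sketch ignores nesting: elements with the same name at different depths end up in the same list.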

This question, "java - XML parsing with dynamic structure and content", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/50132789/
