Java Saxon XPath example with a string as output

Tags: java xml xpath saxon

I am trying to write Java code that uses Saxon XPath. I have two problems:

  1. My Java is not very good.
  2. I am not sure what the best way is to convert a net.sf.saxon.om.NodeInfo to a string.

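Can anyone help? I know that http://www.saxonica.com/download/download_page.xml has some good sample code, but it is not enough.

I have seen the similar SO discussion "XPath processor output as string". In this case, however, I want to use Saxon, which works with NodeInfo.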

<pre>
<!-- language: java --> 
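// Imports the code below needs. The Saxon package names are an assumption:
// they match Saxon 9.x releases that still ship DocumentInfo (before 9.7).
import java.io.StringReader;
import java.util.Iterator;
import java.util.List;
import javax.xml.transform.sax.SAXSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import net.sf.saxon.Configuration;
import net.sf.saxon.lib.NamespaceConstant;
import net.sf.saxon.om.DocumentInfo;
import net.sf.saxon.om.NodeInfo;
import net.sf.saxon.xpath.XPathFactoryImpl;
import org.xml.sax.InputSource;

public class helloSaxon {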
    public static void main(String[] args) {
        String xml = "";
        String xPathStatement = "";
        String xPathResult = "";
        SaxonXPath xPathEvaluation = null;
        Boolean xPathResultMatch = false;
        
        xml="<root><a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>";

        //I'm using the following XPath Tester for test scenarios
        //https://www.freeformatter.com/xpath-tester.html#ad-output
        // Test #1
        xPathStatement="/root/a";
        xPathEvaluation = new SaxonXPath(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #1 xPathResult - " + xPathResult);
            //xPathResult == "<a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a>";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #1 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;

        // Test #2
        xPathStatement="//a";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #2 xPathResult - " + xPathResult);
            //xPathResult == "<a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><a>#DDD#</a>";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #2 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;

        // Test #3
        xPathStatement="/root/a[1]/text()";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #3 xPathResult - " + xPathResult);
            //xPathResult == "#BBB#";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #3 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;

        // Test #4
        xPathStatement="/a/root/a/text()";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #4 xPathResult - " + xPathResult);
            //xPathResult == "";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #4 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == false;
            
        // Test #5
        xPathStatement="/root";
        xPathEvaluation.Reset(xml, xPathStatement);
        xPathResult = xPathEvaluation.getxPathResult();
            System.out.println("Test #5 xPathResult - " + xPathResult);
            //xPathResult == "<root><a version = '1.0' encoding = 'UTF-8'>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>";
        xPathResultMatch = xPathEvaluation.getxPathResultMatch();
            System.out.println("Test #5 xPathResultMatch - " + xPathResultMatch);
            //xPathResultMatch == true;         
    }
    static class SaxonXPath{
        private String xml;
        private String xPathStatement;
        private String xPathResult;
        private Boolean xPathResultMatch;
        public SaxonXPath(String xml, String xPathStatement){
            this.Reset(xml, xPathStatement);
        }
        public void Reset(String xml, String xPathStatement){
            this.xml = xml;
            this.xPathStatement = xPathStatement;
            this.xPathResult = "";
            this.xPathResultMatch = null;
            this.Evaluate();
        }
        public void Evaluate(){
            try{
                System.setProperty("javax.xml.xpath.XPathFactory:" + NamespaceConstant.OBJECT_MODEL_SAXON, "net.sf.saxon.xpath.XPathFactoryImpl");
                XPathFactory xPathFactory = XPathFactory.newInstance(NamespaceConstant.OBJECT_MODEL_SAXON);
                XPath xPath = xPathFactory.newXPath();
                InputSource inputSource = new InputSource(new StringReader(this.xml));
                SAXSource saxSource = new SAXSource(inputSource);
                Configuration config = ((XPathFactoryImpl) xPathFactory).getConfiguration();
                DocumentInfo document = config.buildDocument(saxSource);      
                XPathExpression xPathExpression = xPath.compile(this.xPathStatement);

                List matches = (List) xPathExpression.evaluate(document, XPathConstants.NODESET);
                if (matches != null && matches.size()>0) {
                    this.xPathResultMatch = true;   
                    for (Iterator iter = matches.iterator(); iter.hasNext();) {
                        NodeInfo node = (NodeInfo) iter.next();
                        
                        //need to convert content of "node" to string
                        xPathResult += node.getStringValue();
                    }
                } else {
                    this.xPathResultMatch = false;
                }
            } catch(Exception e){
                e.printStackTrace();
            }           
        }
        public String getxPathResult(){
            return this.xPathResult;
        }
        public Boolean getxPathResultMatch(){
            return this.xPathResultMatch;
        }
    }
}
</pre>

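The code takes the following inputs:

  1. The XML as a string
  2. The XPath expression as a string

and produces the following outputs:

  3. The result of the XPath evaluation as a string
  4. Whether the XPath expression matched anything, as a boolean

I have also added some test examples in the code comments so that you can better understand what I am trying to do.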

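Best answer

First, I would recommend using the s9api interface for this rather than the JAXP XPath interface. There are many reasons, in particular:

  • The JAXP interface is very much geared to XPath 1.0; for example, it only recognizes the data types string, number, boolean, and node-set. XPath 2.0 has a much richer type system.

  • The JAXP interface depends quite heavily on DOM as its object model, although it makes concessions to the possibility of using other models (and the Saxon implementation takes advantage of this by supporting NodeInfo, which is an implementation of the XDM node).

  • The JAXP interface has very little type safety; it makes extensive use of Object as an argument and result type, and it does not use Java generics.

  • Any portability advantage of using a standard API is spurious, because (a) all known implementations other than Saxon support only XPath 1.0, and (b) the values that can be supplied to a parameter declared to accept Object differ from product to product.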
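To make the points about the DOM object model and Object-typed results concrete, here is a minimal sketch that uses only the JAXP classes bundled with the JDK. It is not Saxon and not s9api: the JDK engine supports XPath 1.0 only and returns DOM nodes rather than NodeInfo. The class name `JaxpNodeSerializer` and the method `evaluateToXml` are hypothetical names chosen for illustration. The sketch casts the Object result to a NodeList and serializes each matched node with an identity Transformer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: JDK-only JAXP (XPath 1.0 over DOM), not Saxon.
final class JaxpNodeSerializer {

    // Evaluates the expression and returns the matched nodes serialized as XML,
    // concatenated in document order. Returns "" when nothing matches.
    static String evaluateToXml(String xml, String expression) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            // JAXP returns a plain Object; the caller has to know which cast is valid.
            Object raw = xPath.evaluate(expression, document, XPathConstants.NODESET);
            NodeList nodes = (NodeList) raw;

            // An identity transformer copies each DOM node to text unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            for (int i = 0; i < nodes.getLength(); i++) {
                identity.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            }
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException | TransformerException e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><a>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>";
        System.out.println(evaluateToXml(xml, "/root/a"));
    }
}
```

The unchecked cast to NodeList is exactly the kind of step the type-safety point refers to: nothing in the method signature tells the compiler what `evaluate` will return.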

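Your code creates a new XPathFactory every time it evaluates an XPath expression. Creating an XPathFactory is a very expensive operation, because it involves searching the classpath and examining many different JAR files to see which one contains a suitable XPath engine.

In addition, your code builds the source document from scratch every time it evaluates an XPath expression. Again, this is very expensive.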
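One way to avoid both costs is to parse the document once and keep a single XPath object for all later evaluations. The sketch below shows that pattern with the JDK's built-in JAXP engine rather than Saxon, so it covers XPath 1.0 only; the class name `ReusableXPathEvaluator` and the method `countMatches` are hypothetical. The same structure should carry over to a Saxon-backed factory, but that is an assumption this sketch does not test.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: builds the document and the XPath object once, then reuses them.
final class ReusableXPathEvaluator {
    private final Document document; // parsed once in the constructor
    private final XPath xPath;       // created once; not thread-safe, so keep it per thread

    ReusableXPathEvaluator(String xml) {
        this.document = parse(xml);
        this.xPath = XPathFactory.newInstance().newXPath();
    }

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Evaluates the expression against the already-parsed document
    // and returns how many nodes it selects.
    int countMatches(String expression) {
        try {
            NodeList nodes = (NodeList) xPath.evaluate(expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        ReusableXPathEvaluator evaluator = new ReusableXPathEvaluator(
                "<root><a>#BBB#</a><a>#CCC#</a><b><a>#DDD#</a></b></root>");
        // Three evaluations, one factory lookup, one parse.
        System.out.println(evaluator.countMatches("/root/a"));
        System.out.println(evaluator.countMatches("//a"));
        System.out.println(evaluator.countMatches("/a/root/a/text()"));
    }
}
```

Compared with the `Reset` method in the question, switching to a new expression here costs only the evaluation itself.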

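That said, returning strings and booleans with JAXP is not difficult. Simply change the argument that declares the expected result type from XPathConstants.NODESET to XPathConstants.STRING or XPathConstants.BOOLEAN, and the evaluate() call will return a string or a boolean instead of a list of nodes. If you want to return a date or a duration, however, you are stuck, because JAXP does not support those types.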
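The change of result type described above can be sketched as follows. This uses the JDK's built-in JAXP engine instead of Saxon so that it runs without extra JARs; the class name `JaxpResultTypes` and its methods are hypothetical. Note that XPathConstants.STRING yields the string value of the first selected node (its text content), not serialized markup.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: asks JAXP for a string or a boolean instead of a node set.
final class JaxpResultTypes {
    // Created once and reused; fine for this single-threaded sketch.
    private static final XPathFactory FACTORY = XPathFactory.newInstance();

    // String value of the first node the expression selects, or "" if none.
    static String asString(String xml, String expression) {
        return (String) evaluate(xml, expression, XPathConstants.STRING);
    }

    // True if the expression selects at least one node.
    static boolean asBoolean(String xml, String expression) {
        return (Boolean) evaluate(xml, expression, XPathConstants.BOOLEAN);
    }

    private static Object evaluate(String xml, String expression, QName resultType) {
        try {
            XPath xPath = FACTORY.newXPath();
            // The third argument tells JAXP which Java type to return.
            return xPath.evaluate(expression, new InputSource(new StringReader(xml)), resultType);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><a version = '1.0' encoding = 'UTF-8'>#BBB#</a>"
                + "<a>#CCC#</a><b><a>#DDD#</a></b></root>";
        System.out.println(asString(xml, "/root/a[1]/text()")); // prints #BBB#
        System.out.println(asBoolean(xml, "//a"));              // prints true
        System.out.println(asBoolean(xml, "/a/root/a/text()")); // prints false
    }
}
```

For brevity this sketch re-parses the XML on every call, which is the cost the answer warns about; in real code the parsed document would be kept and reused.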

Source: a similar question on Stack Overflow about a Java Saxon XPath example with a string as output: https://stackoverflow.com/questions/57868984/