java - How to filter out child nodes in XML based on a string in Java

Tags: java xml xpath

I am writing a Java program that reads from a file path and creates a sitemap.xml.

The sitemap.xml looks like this:

<url>
<loc>http://localhost/content/falcon/en/index/auto</loc>
<lastMod>2019-12-05</lastMod>
<changefreq>weekly</changefreq>
<priority>0.0</priority>
<testing>admin</testing>
</url>
<url>

<loc>
http://localhost/content/falcon/en/index/auto/coverage
</loc>
<lastMod>2019-09-11</lastMod>
<changefreq>weekly</changefreq>
<priority>0.9</priority>
<testing>admin</testing>
</url>

<url>
<loc>
http://localhost/content/falcon/en/index/auto/collectible
</loc>
<lastMod>2019-01-17</lastMod>
<changefreq>weekly</changefreq>
<priority>0.9</priority>
<testing>ben.snedeker@tallwave.com</testing>
</url>

<url>
<loc>
http://localhost/content/falcon/en/index/auto/collectible/features-discounts
</loc>
<lastMod>2016-12-30</lastMod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
<testing>usw8453</testing>
</url>

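The <loc> </loc> tag contains a URL that starts out as a plain string. Based on that URL, I want to be able to filter out the whole node, including its sibling tags such as <lastMod>, <changefreq>, <priority> and so on.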

Here is the Java that writes the XML sitemap:

         Resource resource = resourceResolver.getResource(sitemapRootPath);
        if(resource != null) {
            response.setContentType("text/xml;charset=UTF-8");  
            Page page = resource.adaptTo(Page.class);
            Iterator<Page> pageIterator = page.listChildren();

            //Initializing the XML document before writing data into the file
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder;
            try {
                LOG.info("Inside Try");
                builder = factory.newDocumentBuilder();
                Document document = builder.newDocument();

                Element rootElement = document.createElement("urlset");
                rootElement.setAttribute("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");
                document.appendChild(rootElement);

                for(int i = 0; i < staticPageData.length; i ++) {
                    createXMLNodeForStaticPages(document, rootElement, request, staticPageData[i]);
                }

                while(pageIterator.hasNext()) { 



                    createXMLNode(document, rootElement, request, pageIterator);
                }


                Transformer transformer = TransformerFactory.newInstance().newTransformer();
                transformer.setOutputProperty(OutputKeys.INDENT, "yes");
                transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

                //initialize StreamResult with a StringWriter so the XML can be printed to the response
                StreamResult result = new StreamResult(new StringWriter());
                DOMSource source = new DOMSource(document);
                transformer.transform(source, result);
                String xmlString = result.getWriter().toString();
                out.print(xmlString);

This is the method called by the while loop above. It also writes the XML for the child pages in the loop at the bottom.

public void createXMLNode(Document document, Element rootElement, SlingHttpServletRequest request, Iterator<Page> pageIterator) {
        Element headElement = document.createElement("url");
        Element locElement = document.createElement("loc");
        Element lastModElement = document.createElement("lastMod");
        Element changefreqElement = document.createElement("changefreq");
        Element priorityElement = document.createElement("priority");
        Element testingElement = document.createElement("testing");

        Node locElementNode = locElement;
        Node lastModElementNode = lastModElement;
        Node changefreqElementNode = changefreqElement;
        Node priorityElementNode = priorityElement;
        Node testingElementNode = testingElement;

        Page childPage = pageIterator.next();
        String location = request.getScheme() + "://" + request.getServerName() + childPage.getPath();
        locElementNode.setTextContent(location);


        LOG.error("childPage.getLastModified()" + childPage.getLastModified());
        if(null != childPage.getLastModified()) {
            Date date = childPage.getLastModified().getTime();
            DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
            try {
                dateFormat.parse("2019-07-15");
            } catch (ParseException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            lastModElementNode.setTextContent(dateFormat.format(date));
        }


       String editor = childPage.getLastModifiedBy();

        changefreqElementNode.setTextContent("weekly");


        priorityElementNode.setTextContent(PriorityValue(location));
        testingElementNode.setTextContent(editor);


        rootElement.appendChild(headElement);
        headElement.appendChild(locElementNode);
        headElement.appendChild(lastModElementNode);
        headElement.appendChild(changefreqElementNode);
        headElement.appendChild(priorityElementNode);
        headElement.appendChild(testingElementNode);

        Iterator<Page> childPageIterator =  childPage.listChildren();




        while(childPageIterator.hasNext()) {

            createXMLNode(document, rootElement, request, childPageIterator);
        }
    }

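I want to be able to skip the whole child node when a certain string is read. For example, the value inside loc starts out as just a string built from the file path that this Java class reads.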

        String location = request.getScheme() + "://" + request.getServerName() + childPage.getPath();
        locElementNode.setTextContent(location);

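It is stored in the variable location, and then locElementNode is set to that value. I want to be able to filter out the whole node when a certain URL string is read. The while loop should then move on to the next element.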

Best answer

Well, all you need to do is add logic that checks the resulting string before you create any elements and append them:


public void createXMLNode(Document document, Element rootElement, SlingHttpServletRequest request, Iterator<Page> pageIterator) {

        // Advance the iterator first so the current page's path is available
        Page childPage = pageIterator.next();
        String location = request.getScheme() + "://" + request.getServerName() + childPage.getPath();
        // Skip this page before any elements are created; returning here also skips its child pages
        if (location.equals("<banned url>")) {
            return;
        }
        Element headElement = document.createElement("url");
        Element locElement = document.createElement("loc");
        Element lastModElement = document.createElement("lastMod");
        Element changefreqElement = document.createElement("changefreq");
        Element priorityElement = document.createElement("priority");
        Element testingElement = document.createElement("testing");

        Node locElementNode = locElement;
        Node lastModElementNode = lastModElement;
        Node changefreqElementNode = changefreqElement;
        Node priorityElementNode = priorityElement;
        Node testingElementNode = testingElement;

        locElementNode.setTextContent(location);

 ....
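
To make the "check before creating elements" idea easier to try outside of AEM, here is a minimal self-contained sketch. It is not the asker's actual code: `PageNode`, `appendUrl`, `buildSitemap` and the `bannedUrls` set are hypothetical names introduced only for illustration, and `PageNode` is a simple stand-in for the real `Page` type. It also assumes that child pages of a skipped page should still be listed, which may or may not be what you want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SitemapFilterSketch {

    // Hypothetical stand-in for the AEM Page type: a path plus child pages.
    static final class PageNode {
        final String path;
        final List<PageNode> children = new ArrayList<>();

        PageNode(String path) {
            this.path = path;
        }

        PageNode add(PageNode child) {
            children.add(child);
            return this;
        }
    }

    // Appends a <url> entry for the page unless its URL is banned, then visits the children.
    static void appendUrl(Document doc, Element root, String baseUrl,
                          PageNode page, Set<String> bannedUrls) {
        String location = baseUrl + page.path;
        // Decide before creating any elements, so nothing has to be removed afterwards.
        if (!bannedUrls.contains(location)) {
            Element url = doc.createElement("url");
            Element loc = doc.createElement("loc");
            loc.setTextContent(location);
            url.appendChild(loc);
            root.appendChild(url);
        }
        // Assumption: children are still visited when the parent is skipped.
        // Move this loop inside the if-block to drop whole subtrees instead.
        for (PageNode child : page.children) {
            appendUrl(doc, root, baseUrl, child, bannedUrls);
        }
    }

    // Builds a small <urlset> document starting from the given root page.
    static Document buildSitemap(String baseUrl, PageNode rootPage, Set<String> bannedUrls)
            throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("urlset");
        doc.appendChild(root);
        appendUrl(doc, root, baseUrl, rootPage, bannedUrls);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        PageNode auto = new PageNode("/content/falcon/en/index/auto")
                .add(new PageNode("/content/falcon/en/index/auto/coverage"))
                .add(new PageNode("/content/falcon/en/index/auto/collectible"));
        Set<String> banned = Set.of("http://localhost/content/falcon/en/index/auto/coverage");
        Document doc = buildSitemap("http://localhost", auto, banned);
        // Two <url> entries remain: .../auto and .../auto/collectible
        System.out.println(doc.getElementsByTagName("url").getLength());
    }
}
```

Keeping the banned URLs in a `Set` lets one lookup cover several URLs. An early `return`, as in the answer above, would skip the descendants of a banned page as well.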

Regarding "java - How to filter out child nodes in XML based on a string in Java", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59293032/
