I am writing a Java program that reads from a file path and generates a sitemap.xml.
The sitemap.xml looks like this:
<url>
<loc>http://localhost/content/falcon/en/index/auto</loc>
<lastMod>2019-12-05</lastMod>
<changefreq>weekly</changefreq>
<priority>0.0</priority>
<testing>admin</testing>
</url>
<url>
<loc>
http://localhost/content/falcon/en/index/auto/coverage
</loc>
<lastMod>2019-09-11</lastMod>
<changefreq>weekly</changefreq>
<priority>0.9</priority>
<testing>admin</testing>
</url>
<url>
<loc>
http://localhost/content/falcon/en/index/auto/collectible
</loc>
<lastMod>2019-01-17</lastMod>
<changefreq>weekly</changefreq>
<priority>0.9</priority>
<testing>ben.snedeker@tallwave.com</testing>
</url>
<url>
<loc>
http://localhost/content/falcon/en/index/auto/collectible/features-discounts
</loc>
<lastMod>2016-12-30</lastMod>
<changefreq>weekly</changefreq>
<priority>0.8</priority>
<testing>usw8453</testing>
</url>
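The <loc> </loc> tag contains a URL that starts out as a string. Based on that URL, I want to be able to filter out the entire node, including its sibling tags such as <lastMod>, <changefreq>, <priority>, and so on.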
This is the Java that writes the XML sitemap:
Resource resource = resourceResolver.getResource(sitemapRootPath);
if (resource != null) {
    response.setContentType("text/xml;charset=UTF-8");
    Page page = resource.adaptTo(Page.class);
    Iterator<Page> pageIterator = page.listChildren();
    // Initialize the XML document before writing data into it
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder;
    try {
        LOG.info("Inside Try");
        builder = factory.newDocumentBuilder();
        Document document = builder.newDocument();
        Element rootElement = document.createElement("urlset");
        rootElement.setAttribute("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");
        document.appendChild(rootElement);
        for (int i = 0; i < staticPageData.length; i++) {
            createXMLNodeForStaticPages(document, rootElement, request, staticPageData[i]);
        }
        while (pageIterator.hasNext()) {
            createXMLNode(document, rootElement, request, pageIterator);
        }
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        // Serialize the document into a StringWriter, then print it to the response
        StreamResult result = new StreamResult(new StringWriter());
        DOMSource source = new DOMSource(document);
        transformer.transform(source, result);
        String xmlString = result.getWriter().toString();
        out.print(xmlString);
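This is the method called by the while loop above. It also writes the XML for child pages, through the recursive call in the while loop at the bottom.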
public void createXMLNode(Document document, Element rootElement, SlingHttpServletRequest request, Iterator<Page> pageIterator) {
    Element headElement = document.createElement("url");
    Element locElement = document.createElement("loc");
    Element lastModElement = document.createElement("lastMod");
    Element changefreqElement = document.createElement("changefreq");
    Element priorityElement = document.createElement("priority");
    Element testingElement = document.createElement("testing");
    Node locElementNode = locElement;
    Node lastModElementNode = lastModElement;
    Node changefreqElementNode = changefreqElement;
    Node priorityElementNode = priorityElement;
    Node testingElementNode = testingElement;
    Page childPage = pageIterator.next();
    String location = request.getScheme() + "://" + request.getServerName() + childPage.getPath();
    locElementNode.setTextContent(location);
    LOG.error("childPage.getLastModified()" + childPage.getLastModified());
    if (null != childPage.getLastModified()) {
        Date date = childPage.getLastModified().getTime();
        DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        try {
            dateFormat.parse("2019-07-15");
        } catch (ParseException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        lastModElementNode.setTextContent(dateFormat.format(date));
    }
    String editor = childPage.getLastModifiedBy();
    changefreqElementNode.setTextContent("weekly");
    priorityElementNode.setTextContent(PriorityValue(location));
    testingElementNode.setTextContent(editor);
    rootElement.appendChild(headElement);
    headElement.appendChild(locElementNode);
    headElement.appendChild(lastModElementNode);
    headElement.appendChild(changefreqElementNode);
    headElement.appendChild(priorityElementNode);
    headElement.appendChild(testingElementNode);
    Iterator<Page> childPageIterator = childPage.listChildren();
    while (childPageIterator.hasNext()) {
        createXMLNode(document, rootElement, request, childPageIterator);
    }
}
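I want to be able to skip an entire child node when a certain string is read.
For example, the value inside loc is initially just a string built from the file path that this Java class reads.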
String location = request.getScheme() + "://" + request.getServerName() + childPage.getPath();
locElementNode.setTextContent(location);
It is stored in the variable location, and then locElementNode is set to that value.
I want to be able to filter out the entire node when a certain URL string is read. The while loop should then move on to the next element.
Best answer
Well, all you need to do is add logic that checks the resulting string before any elements are created and appended:
public void createXMLNode(Document document, Element rootElement, SlingHttpServletRequest request, Iterator<Page> pageIterator) {
    // Advance the iterator first so the calling while loop always makes progress
    Page childPage = pageIterator.next();
    String location = request.getScheme() + "://" + request.getServerName() + childPage.getPath();
    // Skip this page before any element is created or appended.
    // Returning here also skips its child pages, because the recursion happens at the end of the method.
    if (location.equals("<banned url>")) {
        return;
    }
    Element headElement = document.createElement("url");
    Element locElement = document.createElement("loc");
    Element lastModElement = document.createElement("lastMod");
    Element changefreqElement = document.createElement("changefreq");
    Element priorityElement = document.createElement("priority");
    Element testingElement = document.createElement("testing");
    Node locElementNode = locElement;
    Node lastModElementNode = lastModElement;
    Node changefreqElementNode = changefreqElement;
    Node priorityElementNode = priorityElement;
    Node testingElementNode = testingElement;
    locElementNode.setTextContent(location);
    ....
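If more than one URL has to be excluded, the same check can be made against a set instead of a single string. Below is a minimal stand-alone sketch of that idea, assuming plain strings in place of the AEM Page objects so it runs outside Sling. The class name SitemapFilterSketch, the methods buildSitemap and toXml, and the bannedUrls parameter are hypothetical names used only for illustration; they are not part of the code in the question.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch: plain strings stand in for the Page objects of the question.
class SitemapFilterSketch {

    // Builds a <urlset> document and skips every location contained in bannedUrls.
    static Document buildSitemap(List<String> locations, Set<String> bannedUrls)
            throws ParserConfigurationException {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element rootElement = document.createElement("urlset");
        rootElement.setAttribute("xmlns", "http://www.sitemaps.org/schemas/sitemap/0.9");
        document.appendChild(rootElement);

        for (String location : locations) {
            // Check the string before creating or appending anything, so a
            // banned URL never produces a <url> node or any of its siblings.
            if (bannedUrls.contains(location)) {
                continue;
            }
            Element urlElement = document.createElement("url");
            Element locElement = document.createElement("loc");
            locElement.setTextContent(location);
            Element changefreqElement = document.createElement("changefreq");
            changefreqElement.setTextContent("weekly");
            urlElement.appendChild(locElement);
            urlElement.appendChild(changefreqElement);
            rootElement.appendChild(urlElement);
        }
        return document;
    }

    // Serializes the document to an indented XML string.
    static String toXml(Document document) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        List<String> locations = Arrays.asList(
                "http://localhost/content/falcon/en/index/auto",
                "http://localhost/content/falcon/en/index/auto/coverage",
                "http://localhost/content/falcon/en/index/auto/collectible");
        Set<String> bannedUrls = new HashSet<>(Arrays.asList(
                "http://localhost/content/falcon/en/index/auto/coverage"));
        // Prints a sitemap without the /auto/coverage entry.
        System.out.println(toXml(buildSitemap(locations, bannedUrls)));
    }
}
```

In the recursive createXMLNode, an early return also drops the children of the banned page. If the children should still be listed, one option is to wrap only the element creation in the check and let the recursion at the end of the method run either way.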
Regarding java - how to filter out child nodes in XML based on a string in Java, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/59293032/