xml - Perl XML::XPath adds a bunch of junk to the document

Tags: xml perl xpath

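I have a web.xml that I want to update via XPath. The target elements are modified correctly, but a bunch of junk is added at the start of the document. I get the same junk even if I don't modify any elements and only parse and print the document.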

Code:

require Cwd;
use File::Temp qw/ tempfile tempdir/;
use lib 'menu/perl-modules/lib/site_perl';
use XML::XPath;
use XML::XPath::NodeSet;
#use strict;

$file = "/tmp/web.xml";
my $xp   = XML::XPath->new( filename => $file );
my $root = $xp->find('/')->get_nodelist;
#$xp->setNodeText( $xpath, $newValue );

open( XPATH_FILE, "> $file" );
foreach my $nodes ( $xp->find('/')->get_nodelist ) {
  print XPATH_FILE $nodes->toString;
}
close(XPATH_FILE);

Input file:
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
   <filter>
      <filter-name>LocaleFilter</filter-name>
      ....
</web-app>

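Output: roughly 700 lines of comments at the start of the document, which look like some kind of expansion of the referenced DTD. For readability, I am including only the first few lines: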
<!--
DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER.

Copyright 2000-2007 Sun Microsystems, Inc. All rights reserved.

The contents of this file are subject to the terms of either the GNU
General Public License Version 2 only ("GPL") or the Common Development
and Distribution License("CDDL") (collectively, the "License").  You
may not use this file except in compliance with the License. You can obtain
a copy of the License at https://glassfish.dev.java.net/public/CDDL+GPL.html
or glassfish/bootstrap/legal/LICENSE.txt.  See the License for the specific
language governing permissions and limitations under the License.

When distributing the software, include this License Header Notice in each
file and include the License file at glassfish/bootstrap/legal/LICENSE.txt.
Sun designates this particular file as subject to the "Classpath" exception
as provided by Sun in the GPL Version 2 section of the License file that
accompanied this code.  If applicable, add the following below the License
Header, with the fields enclosed by brackets [] replaced by your own
identifying information: "Portions Copyrighted [year]
[name of copyright owner]"

Contributor(s):

If you wish your version of this file to be governed by only the CDDL or
only the GPL Version 2, indicate your decision by adding "[Contributor]
elects to include this software in this distribution under the [CDDL or GPL
Version 2] license."  If you don't indicate a single choice of license, a
recipient has the option to distribute your version of this file under
either the CDDL, the GPL Version 2 or to extend the choice of license to
its licensees as provided above.  However, if you add GPL Version 2 code
and therefore, elected the GPL Version 2 license, then the option applies
only if the new code is made subject to such option by the copyright
holder.
--><!--
This is the XML DTD for the Servlet 2.3 deployment descriptor.

Best answer

I don't understand why this module takes any linked DTD documents into account, since as far as I can tell it does no validity checking.

Also, while the module allows changing and adding to the nodes of a document, there is no apparent way to delete nodes.

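However, the comments you want to exclude are children of the root node, so they can be effectively removed by regenerating the document from the root node's only element child.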

This code demonstrates the approach:

use strict;
use warnings;
use autodie;
use 5.010;

use XML::XPath;

my $xp   = XML::XPath->new( ioref => *DATA );
my ($new_root) = $xp->findnodes('/*');

print $new_root->toString, "\n";

__DATA__
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>

Output
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>
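
Note that the output above no longer contains the DOCTYPE declaration, because only the root element is serialized. The following is a minimal sketch of one way to keep it while still using the `findnodes('/*')` idea. It is not part of the original answer: `split_doctype` and `rewrite_xml` are hypothetical helpers, the regex assumes a DOCTYPE without an internal subset, and the XPath and new value are illustrative only. Unlike the code above, it removes the DOCTYPE before parsing, so the parser never sees the DTD reference, which assumes the document uses no entities defined in the DTD.

```perl
use strict;
use warnings;

use XML::XPath;

# Hypothetical helper (not part of XML::XPath): split a document into its
# DOCTYPE declaration (if any) and the rest of the text.
sub split_doctype {
    my ($xml) = @_;
    my $doctype = '';

    # Assumption: the DOCTYPE has no internal subset, i.e. no [ ... ] part.
    if ( $xml =~ s/(<!DOCTYPE\b[^>\[]*>)\s*// ) {
        $doctype = $1;
    }
    return ( $doctype, $xml );
}

# Hypothetical helper: optionally update one node's text, then rebuild the
# document from the saved DOCTYPE plus the root element only.
sub rewrite_xml {
    my ( $xml, $xpath, $new_value ) = @_;
    my ( $doctype, $body ) = split_doctype($xml);

    # Parse the text without the DOCTYPE, so no DTD is referenced.
    my $xp = XML::XPath->new( xml => $body );
    $xp->setNodeText( $xpath, $new_value ) if defined $xpath;

    # Serialize only the root element, as in the answer above.
    my ($root) = $xp->findnodes('/*');
    my $out = $root->toString . "\n";

    # Put the original DOCTYPE declaration back in front, if there was one.
    $out = "$doctype\n$out" if length $doctype;
    return $out;
}

my $input = <<'XML';
<!DOCTYPE web-app PUBLIC
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd" >
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>
XML

print rewrite_xml( $input, '/web-app/filter/filter-name', 'OtherFilter' );
```

To apply this to a real file, read the whole file into a string first, pass it to `rewrite_xml`, and only then open the file for writing, so a parse failure does not leave a truncated web.xml behind.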

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/23524047/
