I need to remove certain parts of an XML file, for example this one:
<dict>
<key>Images</key>
<array>
<dict>
<key>ImageIndex</key>
<integer>0</integer>
<key>NumberOfROIs</key>
<integer>42</integer>
<key>ROIs</key>
<array>
<dict>
<key>Area</key>
<real>0.0</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>0.0</real>
<key>IndexInImage</key>
<integer>0</integer>
<key>Max</key>
<real>1358</real>
<key>Mean</key>
<real>1358</real>
<key>Min</key>
<real>1358</real>
<key>Name</key>
<string>Calcification</string>
<key>NumberOfPoints</key>
<integer>1</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(2964.620117, 3427.979980)</string>
</array>
<key>Total</key>
<real>1358</real>
<key>Type</key>
<integer>19</integer>
</dict>
<dict>
<key>Area</key>
<real>0.0</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>0.0</real>
<key>IndexInImage</key>
<integer>1</integer>
<key>Max</key>
<real>1401</real>
<key>Mean</key>
<real>1401</real>
<key>Min</key>
<real>1401</real>
<key>Name</key>
<string>Calcification</string>
<key>NumberOfPoints</key>
<integer>1</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(2993.159912, 3403.550049)</string>
</array>
<key>Total</key>
<real>1401</real>
<key>Type</key>
<integer>19</integer>
</dict>
<dict>
<key>Area</key>
<real>1.3665732145309448</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>66.487342834472656</real>
<key>IndexInImage</key>
<integer>36</integer>
<key>Max</key>
<real>1836</real>
<key>Mean</key>
<real>1583.29638671875</real>
<key>Min</key>
<real>1313</real>
<key>Name</key>
<string>Mass</string>
<key>NumberOfPoints</key>
<integer>89</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(3196.290039, 1048.599976)</string>
<string>(3203.560059, 1046.170044)</string>
<string>(3211.330078, 1042.780029)</string>
<string>(3189.500000, 1050.540039)</string>
</array>
<key>Total</key>
<real>44457380</real>
<key>Type</key>
<integer>15</integer>
</dict>
</array>
</dict>
</array>
</dict>
</plist>
I want to remove the Calcification entries, so that the result looks like this:
<dict>
<key>Images</key>
<array>
<dict>
<key>ImageIndex</key>
<integer>0</integer>
<key>NumberOfROIs</key>
<integer>42</integer>
<key>ROIs</key>
<array>
<dict>
<key>Area</key>
<real>1.3665732145309448</real>
<key>Center</key>
<string>(0.000000, 0.000000, 0.000000)</string>
<key>Dev</key>
<real>66.487342834472656</real>
<key>IndexInImage</key>
<integer>36</integer>
<key>Max</key>
<real>1836</real>
<key>Mean</key>
<real>1583.29638671875</real>
<key>Min</key>
<real>1313</real>
<key>Name</key>
<string>Mass</string>
<key>NumberOfPoints</key>
<integer>89</integer>
<key>Point_mm</key>
<array>
<string>(0.000000, 0.000000, 0.000000)</string>
<string>(0.000000, 0.000000, 0.000000)</string>
</array>
<key>Point_px</key>
<array>
<string>(3196.290039, 1048.599976)</string>
<string>(3203.560059, 1046.170044)</string>
<string>(3211.330078, 1042.780029)</string>
<string>(3189.500000, 1050.540039)</string>
</array>
<key>Total</key>
<real>44457380</real>
<key>Type</key>
<integer>15</integer>
</dict>
</array>
</dict>
</array>
</dict>
</plist>
This is what I tried:
data = r"C:\Users\vinc\Desktop\ExemploXML.xml"
import xml.etree.ElementTree as ET
tree = ET.parse(data)
root = tree.getroot()
for e in root.findall(".//string"):
    if e.text == 'Calcification':
        print(e)
        root.remove(e)
    else:
        pass
tree.write(r'C:\Users\vinc\Desktop\out.xml')
Result ======================================
<Element 'string' at 0x000002B085002EA0>
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-d417d00038ed> in <module>
8
9 print(e)
---> 10 root.remove(e)
11 else:
12 pass
ValueError: list.remove(x): x not in list
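For context, these XML files contain semantic segmentation information, and I want to remove the annotations of the Calcification class.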
Best answer
Here is an XSLT-based solution.
The XSLT below follows the so-called Identity Transform pattern.
A single one-line template removes the unwanted <dict> elements:
<xsl:template match="dict[string='Calcification']"/>
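For comparison, the same match can be expressed with the standard library alone. The attempt in the question fails because `Element.remove` only removes direct children of the element it is called on, and the matched `<string>` is nested several levels below `root`. The sketch below looks up each parent and removes whole `<dict>` children instead. The helper name `remove_dicts_named` and the trimmed-down `SAMPLE` are illustrative assumptions, not part of the original question or answer.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the file in the question (assumption: same nesting).
SAMPLE = """<plist><dict><key>Images</key><array><dict>
<key>ROIs</key><array>
<dict><key>Name</key><string>Calcification</string></dict>
<dict><key>Name</key><string>Calcification</string></dict>
<dict><key>Name</key><string>Mass</string></dict>
</array></dict></array></dict></plist>"""


def remove_dicts_named(root, name):
    """Remove every <dict> that has a direct <string> child equal to `name`."""
    removed = 0
    # Snapshot the elements first so the tree is not mutated while iterating.
    for parent in list(root.iter()):
        # Same predicate as the XSLT match, limited to direct children of `parent`.
        for child in parent.findall(f"dict[string='{name}']"):
            parent.remove(child)  # valid: `child` is a direct child of `parent`
            removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed = remove_dicts_named(root, "Calcification")
remaining = [s.text for s in root.iter("string")]
print(removed, remaining)  # 2 ['Mass']
```

For a file on disk, `ET.parse(path)` followed by `tree.write(out_path)` would replace the `ET.fromstring` call. Like the XSLT template, this matches any `<dict>` with a `<string>` child equal to the given text, not only the one that follows the `Name` key.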
Related question: How to transform an XML file using XSLT in Python?
XSLT
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="dict[string='Calcification']"/>
</xsl:stylesheet>
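One way to run this stylesheet from Python is a minimal sketch like the following. It assumes the third-party `lxml` package is installed (the standard library's `xml.etree.ElementTree` has no XSLT support). The inline `SAMPLE` document is a shortened, hypothetical stand-in for the real file.

```python
from lxml import etree  # third-party; assumed to be installed (pip install lxml)

# The stylesheet from this answer, embedded so the example is self-contained.
XSLT_SOURCE = b"""<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="dict[string='Calcification']"/>
</xsl:stylesheet>
"""

# Shortened stand-in for the input file (assumption: same element nesting).
SAMPLE = b"""<plist><dict><key>ROIs</key><array>
<dict><key>Name</key><string>Calcification</string></dict>
<dict><key>Name</key><string>Mass</string></dict>
</array></dict></plist>"""

# Compile the stylesheet once; the resulting object is callable on parsed documents.
transform = etree.XSLT(etree.fromstring(XSLT_SOURCE))

# Apply it: everything is copied except <dict> elements matching the empty template.
result = transform(etree.fromstring(SAMPLE))

# str() serializes the result according to the stylesheet's <xsl:output> settings.
output = str(result)
print(output)
```

For real files, `etree.parse(path)` would replace the two `etree.fromstring` calls, and the result could be saved with `result.write_output(out_path)`.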
Regarding "python - How to remove part of an XML file?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/70442605/