python - How to edit an XML file in batch/Python

Tags: python xml batch-file vbscript fetch

I am trying to edit an XML file from a batch/Python script.

Here is my XML file:

<?xml version="1.0" encoding="UTF-8"?>
<task name="analyse">
   <taskInfo taskId="21a09311-ade3-4e9a-af21-d13be8b7ba45" runAt="2015-05-20 13:48:50" runTime="5 minutes, 53 seconds">
      <project name="13955 - HMI Volvo Truck PA15" number="e20d51c0-71dc-4572-8f9b-4c150bf35222" />
      <language lcid="1031" name="German (Germany)" />
      <tm name="ENG-DEU_en-GB_de-DE.sdltm" />
      <settings reportInternalFuzzyLeverage="yes" reportLockedSegments="no" reportCrossFileRepetitions="yes" minimumMatchScore="70" searchMode="bestWins" missingFormattingPenalty="1" differentFormattingPenalty="1" multipleTranslationsPenalty="1" autoLocalizationPenalty="0" textReplacementPenalty="0" />
   </taskInfo>
   <file name="VT MAIN TRACK_PA15_Default_DE-DE_20150520_102527.xlf.sdlxliff" guid="111f9ba6-82f6-45fb-ac49-8bf6cf57c169">
      <analyse>
         <perfect segments="0" words="0" characters="0" placeables="0" tags="0" />
         <inContextExact segments="60" words="55" characters="755" placeables="3" tags="0" />
         <!-- Replace the value words="55" with "0" -->
         <exact segments="114" words="334" characters="1687" placeables="14" tags="3" />
         <locked segments="0" words="0" characters="0" placeables="0" tags="0" />
         <crossFileRepeated segments="2" words="20" characters="0" placeables="0" tags="0" />
         <!-- Cut the value words="20" and replace it with "0" -->
         <repeated segments="17" words="34" characters="293" placeables="2" tags="0" />
         <!-- Add the 20 words cut above to the current value 34, so the new value is words="54" -->
         <total segments="449" words="1462" characters="7630" placeables="66" tags="24" />
         <new segments="126" words="434" characters="2384" placeables="18" tags="5" />
         <fuzzy min="75" max="84" segments="25" words="108" characters="528" placeables="6" tags="3" />
         <fuzzy min="85" max="94" segments="23" words="92" characters="454" placeables="7" tags="4" />
         <fuzzy min="95" max="99" segments="77" words="260" characters="1318" placeables="13" tags="6" />
         <internalFuzzy min="75" max="84" segments="3" words="16" characters="100" placeables="2" tags="2" />
         <internalFuzzy min="85" max="94" segments="4" words="25" characters="111" placeables="1" tags="1" />
         <internalFuzzy min="95" max="99" segments="0" words="0" characters="0" placeables="0" tags="0" />
      </analyse>
   </file>
   <file name="VT MAIN TRACK_PA15_Default_DE-DE_20150523_254796.xlf.sdlxliff" guid="111f9ba6-82f6-45fb-ac49-8bf6cf57c169">
      <analyse>
         <perfect segments="0" words="0" characters="0" placeables="0" tags="0" />
         <inContextExact segments="60" words="67" characters="755" placeables="3" tags="0" />
         <!-- Replace the value words="67" with "0" -->
         <exact segments="114" words="334" characters="1687" placeables="14" tags="3" />
         <locked segments="0" words="0" characters="0" placeables="0" tags="0" />
         <crossFileRepeated segments="2" words="35" characters="0" placeables="0" tags="0" />
         <!-- Cut the value words="35" and replace it with "0" -->
         <repeated segments="17" words="54" characters="293" placeables="2" tags="0" />
         <!-- Add the 35 words cut above to the current value 54, so the new value is words="89" -->
         <total segments="449" words="1462" characters="7630" placeables="66" tags="24" />
         <new segments="126" words="434" characters="2384" placeables="18" tags="5" />
         <fuzzy min="75" max="84" segments="25" words="108" characters="528" placeables="6" tags="3" />
         <fuzzy min="85" max="94" segments="23" words="92" characters="454" placeables="7" tags="4" />
         <fuzzy min="95" max="99" segments="77" words="260" characters="1318" placeables="13" tags="6" />
         <internalFuzzy min="75" max="84" segments="3" words="16" characters="100" placeables="2" tags="2" />
         <internalFuzzy min="85" max="94" segments="4" words="25" characters="111" placeables="1" tags="1" />
         <internalFuzzy min="95" max="99" segments="0" words="0" characters="0" placeables="0" tags="0" />
      </analyse>
   </file>
   <batchTotal>
      <analyse>
         <perfect segments="0" words="0" characters="0" placeables="0" tags="0" />
         <inContextExact segments="60" words="139" characters="755" placeables="3" tags="0" />
         <exact segments="114" words="334" characters="1687" placeables="14" tags="3" />
         <locked segments="0" words="0" characters="0" placeables="0" tags="0" />
         <crossFileRepeated segments="0" words="0" characters="0" placeables="0" tags="0" />
         <repeated segments="17" words="54" characters="293" placeables="2" tags="0" />
         <total segments="449" words="1462" characters="7630" placeables="66" tags="24" />
         <new segments="126" words="434" characters="2384" placeables="18" tags="5" />
         <fuzzy min="75" max="84" segments="25" words="108" characters="528" placeables="6" tags="3" />
         <fuzzy min="85" max="94" segments="23" words="92" characters="454" placeables="7" tags="4" />
         <fuzzy min="95" max="99" segments="77" words="260" characters="1318" placeables="13" tags="6" />
         <internalFuzzy min="75" max="84" segments="3" words="16" characters="100" placeables="2" tags="2" />
         <internalFuzzy min="85" max="94" segments="4" words="25" characters="111" placeables="1" tags="1" />
         <internalFuzzy min="95" max="99" segments="0" words="0" characters="0" placeables="0" tags="0" />
      </analyse>
   </batchTotal>
</task>

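General notes:

  • <task> is the root element (closed by </task>).
  • What matters here is modifying some tags inside the section named <file>, which ends with the closing tag </file>.
  • The <file>...</file> block can occur any number of times.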

What I need:

For each <file> element, I want to:

  • In <inContextExact>, set the value of the words attribute to 0:

    <inContextExact ... words="55" ... /> => <inContextExact ... words="0" ... />

  • In <crossFileRepeated>, set the value of the words attribute to 0:

    <crossFileRepeated ... words="20" ... /> => <crossFileRepeated ... words="0" ... />

  • In <total>, set the words attribute to a value computed by my own logic:

    <total ... words="1462" ... /> => <total ... words="??" ... />

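I would really appreciate an example of processing XML files in batch/Python.

Best answer

Let's use Python!

This is very easy to do in Python. Since you said a Python solution is fine, take a look at the script below.

It shows how to iterate over a directory containing XML files, process each one as requested, and save the changes back to the same file.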

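from xml.etree import ElementTree
import os


def edit_xml_file(data):
    """Update the word counts in every <file> element and return the XML as bytes."""
    e = ElementTree.fromstring(data)

    # findall('file') only matches direct children of <task>,
    # so the <batchTotal> section is left untouched
    for file_element in e.findall('file'):

        analyse_element = file_element.find('analyse')

        # remember the old value, then reset words to 0
        in_context_exact_element = analyse_element.find('inContextExact')
        in_context_exact_words = int(in_context_exact_element.get('words'))
        in_context_exact_element.set('words', '0')

        cross_file_repeated_element = analyse_element.find('crossFileRepeated')
        cross_file_repeated_words = int(cross_file_repeated_element.get('words'))
        cross_file_repeated_element.set('words', '0')

        # example logic only: replace this sum with your own calculation
        total_element = analyse_element.find('total')
        total_element.set('words', str(in_context_exact_words + cross_file_repeated_words))

    # with no arguments, tostring() emits ASCII without an XML declaration
    xmlstr = ElementTree.tostring(e)
    return xmlstr


def main():

    source_directory = 'xmlfiles'

    for filename in os.listdir(source_directory):

        # skip anything that is not an XML file
        if not filename.endswith('.xml'):
            continue

        xml_file_path = os.path.join(source_directory, filename)
        # open for reading and writing in binary mode to edit the file in place
        with open(xml_file_path, 'r+b') as f:
            data = f.read()
            fixed_data = edit_xml_file(data)
            f.seek(0)
            f.write(fixed_data)
            # drop leftover bytes if the new content is shorter than the old
            f.truncate()


if __name__ == '__main__':
    main()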
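
One detail worth knowing about the script above: called without arguments, ElementTree.tostring leaves out the <?xml ...?> declaration, so the rewritten files lose their first line. Here is a minimal sketch of one way to keep it, assuming Python 3.8 or later (where tostring accepts the xml_declaration keyword). The helper name serialize_with_declaration and the sample string are made up for illustration and are not part of the original script.

```python
from xml.etree import ElementTree


def serialize_with_declaration(root):
    """Serialize an element as UTF-8 bytes, including the XML declaration.

    Hypothetical helper, not part of the original script. Needs
    Python 3.8+ for the xml_declaration keyword of tostring().
    """
    return ElementTree.tostring(root, encoding='UTF-8', xml_declaration=True)


if __name__ == '__main__':
    # Made-up sample input, much shorter than the real report file.
    sample = b'<task name="analyse"><file name="a.sdlxliff" /></task>'
    root = ElementTree.fromstring(sample)
    print(serialize_with_declaration(root).decode('utf-8'))
```

If you want this behaviour, the last two lines of edit_xml_file could return the result of such a call instead of the plain tostring(e).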

This solution uses Python's built-in ElementTree module (xml.etree.ElementTree).
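
The annotations inside the question's XML describe a slightly different rule from the script above: set words to 0 in inContextExact and crossFileRepeated, and add the old crossFileRepeated count to repeated (20 + 34 = 54 in the first file). If that is what you are after, here is a sketch of that variant. The function name move_cross_file_words is made up, and skipping any <file> that lacks one of the expected elements is my assumption about how you would want missing data handled.

```python
from xml.etree import ElementTree


def move_cross_file_words(data):
    """Apply the rule from the question's inline annotations to every <file>.

    Hypothetical variant of edit_xml_file; returns the modified root element.
    """
    root = ElementTree.fromstring(data)
    # findall('file') only matches direct children of <task>,
    # so <batchTotal> is left untouched.
    for file_element in root.findall('file'):
        analyse = file_element.find('analyse')
        if analyse is None:
            continue
        in_context = analyse.find('inContextExact')
        cross_file = analyse.find('crossFileRepeated')
        repeated = analyse.find('repeated')
        if in_context is None or cross_file is None or repeated is None:
            continue  # assumption: skip incomplete entries
        moved = int(cross_file.get('words'))
        in_context.set('words', '0')
        cross_file.set('words', '0')
        # add the words cut from crossFileRepeated to repeated
        repeated.set('words', str(int(repeated.get('words')) + moved))
    return root


if __name__ == '__main__':
    # Trimmed-down sample based on the first <file> in the question.
    sample = """<task name="analyse">
       <file name="example.sdlxliff">
          <analyse>
             <inContextExact segments="60" words="55" />
             <crossFileRepeated segments="2" words="20" />
             <repeated segments="17" words="34" />
          </analyse>
       </file>
    </task>"""
    root = move_cross_file_words(sample)
    print(root.find('file/analyse/repeated').get('words'))  # 54
```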

Regarding "python - How to edit an XML file in batch/Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30526125/
