python - lxml - Saving XML while looping over Excel rows goes wrong when the file-name values are identical

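I have the following problem: when I loop over the Excel rows and save each row to its own XML file, it works, unless two rows have the same value in the column used for the file name (column M). In that case the earlier XML file is, of course, overwritten.

I have been trying everything for two days.

In the screenshot, the column responsible for the file name is column M.

I know I need to add some kind of if statement for this case.

My idea is to create a single XML file in that case, put both "accountsPayableLedger" elements with their inner values into it, and sum the amounts into the "consolidatedAmount" attribute of "consolidate".

Thanks in advance for any help; it is much appreciated.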

from lxml import etree
import openpyxl


def makeroot():
    return etree.Element("LedgerImport")


# open the Excel spreadsheet
wb = openpyxl.load_workbook('import_spendesk_datev.xlsx')
sheet = wb['Import']

# build one XML tree per row (rows 2-5) and write it to a file named after column M
for i in range(2, 6):
    xmlRoot = makeroot()
    consolidate = etree.SubElement(xmlRoot, 'consolidate', attrib={
        'consolidatedAmount': str(sheet.cell(row=i, column=16).value),
        'consolidatedDate': str(sheet.cell(row=i, column=2).value),
        'consolidatedInvoiceId': str(sheet.cell(row=i, column=13).value),
        'consolidatedCurrencyCode': str(sheet.cell(row=i, column=12).value),
    })
    accountsPayableLedger = etree.SubElement(consolidate, 'accountsPayableLedger')
    account = etree.SubElement(accountsPayableLedger, 'bookingText')
    account.text = sheet.cell(row=i, column=21).value
    doc = etree.ElementTree(xmlRoot)
    # rows that share a column M value write to the same path, so the later one overwrites the earlier
    doc.write(str(sheet.cell(row=i, column=13).value) + ".xml", xml_declaration=True, encoding='utf-8', pretty_print=True)

If two rows have the same value, this should be the desired result:

<?xml version='1.0' encoding='UTF-8'?>
<LedgerImport xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://xml.datev.de/bedi/tps/ledger/v040" generating_system="DATEV manuell" generator_info="DATEV Musterdaten" version="4.0" xsi:schemaLocation="http://xml.datev.de/bedi/tps/ledger/v040 Belegverwaltung_online_ledger_import_v040.xsd">
  <consolidate consolidatedAmount="2000">
    <accountsPayableLedger>
      <bookingText>amazon</bookingText>
      <invoiceId>1</invoiceId>
      <amount>500</amount>
    </accountsPayableLedger>
    <accountsPayableLedger>
      <bookingText>amazon 2</bookingText>
      <invoiceId>2</invoiceId>
      <amount>1500</amount>
    </accountsPayableLedger>
  </consolidate>
</LedgerImport>

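Best answer

If you have duplicate rows and want to group them into a single output element or XML file, I think one approach is to first group the rows of the Excel spreadsheet by that cell value, and then make sure you sum the amounts: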

from lxml import etree
import openpyxl
import itertools as it

wb = openpyxl.load_workbook(r'import_spendesk_datev.xlsx')
sheet = wb['Import']


def file_key(row):
    # column M (0-based index 12) holds the value used as the file name
    return str(row[12].value)


# rows 2-5 of the sheet, as in the question; groupby only merges adjacent rows, so sort first
sortedrows = sorted(sheet.iter_rows(min_row=2, max_row=5), key=file_key)

for k, rowGroup in it.groupby(sortedrows, key=file_key):
    root = etree.Element("LedgerImport", attrib={"name": k})
    rows = list(rowGroup)
    if len(rows) > 1:
        # several rows share this key: wrap them in one element carrying the summed amount (column P)
        consolidated = etree.Element("consolidate", attrib={"amount": str(sum(row[15].value for row in rows))})
        for row in rows:
            etree.SubElement(consolidated, "accountsPayableLedger", attrib={"amount": str(row[15].value)})
        root.append(consolidated)
    else:
        etree.SubElement(root, "accountsPayableLedger", attrib={"amount": str(rows[0][15].value)})
    etree.dump(root)  # prints the tree; to save it, write etree.ElementTree(root) to a file instead

I'm not sure about the exact cell indices; you will need to adjust them.
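
To make the group-then-sum idea easier to try in isolation, here is a minimal sketch that needs neither the workbook nor lxml. It is only an illustration under some assumptions: it uses the standard library's xml.etree.ElementTree in place of lxml, a hard-coded ROWS list in place of the sheet, and the names ROWS, group_rows and build_ledger are hypothetical helpers, not part of the original code. It also always wraps the entries in "consolidate", as the question's code does, even when a key has only one row.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical stand-in for the spreadsheet rows:
# (file-name key from column M, amount from column P, booking text from column U).
ROWS = [
    ("INV-1", 500, "amazon"),
    ("INV-1", 1500, "amazon 2"),
    ("INV-2", 300, "office supplies"),
]


def group_rows(rows):
    """Collect rows under their file-name key, keeping first-seen order."""
    groups = defaultdict(list)
    for key, amount, text in rows:
        groups[key].append((amount, text))
    return groups


def build_ledger(key, entries):
    """Build one LedgerImport tree holding every row that shares `key`."""
    root = ET.Element("LedgerImport")
    consolidate = ET.SubElement(
        root,
        "consolidate",
        {
            # the consolidated amount is the sum over all rows in the group
            "consolidatedAmount": str(sum(amount for amount, _ in entries)),
            "consolidatedInvoiceId": str(key),
        },
    )
    for amount, text in entries:
        ledger = ET.SubElement(consolidate, "accountsPayableLedger")
        ET.SubElement(ledger, "bookingText").text = text
        ET.SubElement(ledger, "amount").text = str(amount)
    return root


if __name__ == "__main__":
    # one file per distinct key, so nothing is overwritten
    for key, entries in group_rows(ROWS).items():
        tree = ET.ElementTree(build_ledger(key, entries))
        tree.write(f"{key}.xml", xml_declaration=True, encoding="utf-8")
```

Grouping into a dictionary avoids the sort that itertools.groupby requires, and because each key is written exactly once, the overwrite problem cannot occur. With lxml the same structure should carry over by swapping the import and adding pretty_print=True to the write call.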

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/57710168/
