python - Parsing XML with PETL

Tags: python xml parsing etl petl

I am trying to parse the following XML in Python using the PETL package:

<Msg_file LeagueID="00" League="NBA" Season="2012-13" SeasonType="Regular Season">
<Game Number="0">
<Msg_Roster>
  <Player_info Person_id="2734" Team_id="1610612737" Player_status="A" First_name="Devin" Last_name="Harris" Jersey_number="34" Birth_date="February 27, 1983" Height="6'3&quot;" Weight="192" Position="G" School="Wisconsin" SchoolType="College" Country="USA" Display_affiliation="Wisconsin/USA" DraftYear="2004" FreeAgent="N" SeasonExp="8" PlayerCode="devin_harris"></Player_info>
  <Player_info Person_id="201143" Team_id="1610612737" Player_status="A" First_name="Al" Last_name="Horford" Jersey_number="15" Birth_date="June 03, 1986" Height="6'10&quot;" Weight="250" Position="C-F" School="Florida" SchoolType="College" Country="Dominican Republic" Display_affiliation="Florida/Dominican Republic" DraftYear="2007" FreeAgent="N" SeasonExp="5" PlayerCode="al_horford"></Player_info>
  <Player_info Person_id="203098" Team_id="1610612737" Player_status="A" First_name="John" Last_name="Jenkins" Jersey_number="12" Birth_date="March 06, 1991" Height="6'4&quot;" Weight="215" Position="G" School="Vanderbilt" SchoolType="College" Country="USA" Display_affiliation="Vanderbilt/USA" DraftYear="2012" FreeAgent="N" SeasonExp="0" PlayerCode="john_jenkins"></Player_info>
  <Player_info Person_id="201274" Team_id="1610612737" Player_status="A" First_name="Ivan" Last_name="Johnson" Jersey_number="44" Birth_date="April 10, 1984" Height="6'8&quot;" Weight="255" Position="F" School="Cal State San Bernardino" SchoolType="College" Country="USA" Display_affiliation="Cal State San Bernardino/USA" DraftYear="2011" FreeAgent="N" SeasonExp="1" PlayerCode="ivan_johnson"></Player_info>
  <Player_info Person_id="2563" Team_id="1610612737" Player_status="A" First_name="Dahntay" Last_name="Jones" Jersey_number="30" Birth_date="December 27, 1980" Height="6'6&quot;" Weight="225" Position="F" School="Duke" SchoolType="College" Country="USA" Display_affiliation="Duke/USA" DraftYear="2003" FreeAgent="N" SeasonExp="9" PlayerCode="dahntay_jones"></Player_info>
  <Player_info Person_id="2594" Team_id="1610612737" Player_status="A" First_name="Kyle" Last_name="Korver" Jersey_number="26" Birth_date="March 17, 1981" Height="6'7&quot;" Weight="212" Position="F-G" School="Creighton" SchoolType="College" Country="USA" Display_affiliation="Creighton/USA" DraftYear="2003" FreeAgent="N" SeasonExp="9" PlayerCode="kyle_korver"></Player_info>
 </Msg_Roster>
</Game>
</Msg_file>

I am using the following code in PETL:

import petl as etl
table2 = etl.fromxml('nba_rosters.xml','player_info','playercode')

When I display the table, I get an error message:

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in <module>
    table2
  File "C:\Python\Python36-32\lib\idlelib\rpc.py", line 617, in displayhook
    text = repr(value)
  File "C:\Python\Python36-32\lib\site-packages\petl\util\vis.py", line 135, in _table_repr
    return str(look(table))
  File "C:\Python\Python36-32\lib\site-packages\petl\util\vis.py", line 122, in __repr__
    truncate=truncate, width=width)
  File "C:\Python\Python36-32\lib\site-packages\petl\util\vis.py", line 197, in _look_grid
    hdr = next(it)
StopIteration

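Any ideas on how to parse this file correctly would be a huge help. I am new to Python and can successfully parse the sample files provided in the PETL documentation, but I have not been able to carry that over to a real-world case.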

Best answer

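The names you pass to fromxml contain typos (XML element and attribute names are case-sensitive, so player_info and playercode do not match Player_info and PlayerCode), and you need one more argument, because PlayerCode is an attribute rather than a child element.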
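The case-sensitivity point can be checked without PETL. The following is only a minimal sketch using the standard library's xml.etree.ElementTree on a trimmed-down copy of the roster (two players, fewer attributes); the variable names are made up for illustration, and it is not how PETL is implemented internally.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the roster from the question (two players only).
XML = """<Msg_file LeagueID="00" League="NBA">
<Game Number="0">
<Msg_Roster>
  <Player_info Person_id="2734" First_name="Devin" Last_name="Harris" PlayerCode="devin_harris"></Player_info>
  <Player_info Person_id="201143" First_name="Al" Last_name="Horford" PlayerCode="al_horford"></Player_info>
</Msg_Roster>
</Game>
</Msg_file>"""

root = ET.fromstring(XML)

# Lower-case names match nothing, because XML names are case-sensitive.
wrong = root.findall(".//player_info")

# The exact spelling used in the file matches both elements.
right = root.findall(".//Player_info")

# PlayerCode is an attribute, not a child element, so it is read with .get().
codes = [elem.get("PlayerCode") for elem in right]

# The lower-case attribute name is simply absent, so .get() returns None.
missing = [elem.get("playercode") for elem in right]

print(len(wrong), len(right))
print(codes)
print(missing)
```

With nothing matched, a table has no rows at all, not even a header row, which is consistent with the StopIteration raised at hdr = next(it) in the traceback.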

Code:

import petl as etl
table2 = etl.fromxml('nba_rosters.xml', 'Msg_Roster', 'Player_info', 'PlayerCode')
print(table2)

Result:

+--------------+------------+--------------+--------------+---------------+-------------+
| devin_harris | al_horford | john_jenkins | ivan_johnson | dahntay_jones | kyle_korver |
+==============+============+==============+==============+===============+=============+
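Note that the result above is a single row holding every player code, which PETL displays as the header. If you would rather have one row per player with several attributes as columns, here is a rough sketch of that shape using only the standard library. The helper name roster_rows and the FIELDS list are hypothetical, and the XML is a shortened copy of the file from the question. As far as I know, a list of rows with the header first is a shape PETL can treat as a table, but I have not shown that step here.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the roster from the question (two players only).
XML = """<Msg_file LeagueID="00" League="NBA">
<Game Number="0">
<Msg_Roster>
  <Player_info First_name="Devin" Last_name="Harris" Position="G" PlayerCode="devin_harris"></Player_info>
  <Player_info First_name="Al" Last_name="Horford" Position="C-F" PlayerCode="al_horford"></Player_info>
</Msg_Roster>
</Game>
</Msg_file>"""

# Attributes to pull out of each Player_info element (illustrative choice).
FIELDS = ["PlayerCode", "First_name", "Last_name", "Position"]


def roster_rows(xml_text, fields):
    """Return a header row followed by one row per Player_info element."""
    root = ET.fromstring(xml_text)
    rows = [list(fields)]
    # iter() walks the whole tree, so the nesting under Game/Msg_Roster
    # does not have to be spelled out.
    for player in root.iter("Player_info"):
        rows.append([player.get(name) for name in fields])
    return rows


rows = roster_rows(XML, FIELDS)
for row in rows:
    print(row)
```

Reading from the file instead would mean replacing ET.fromstring(xml_text) with ET.parse('nba_rosters.xml').getroot().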

Source: the original question "python - Parsing XML with PETL" on Stack Overflow: https://stackoverflow.com/questions/49724045/
