python - Extracting an element from XML

Tags: python xml django lxml

I need to get a value out of the XML below.

I have a Django REST framework API that serves the following XML:

<root>
  <list-item>
    <jobmst_id>3493</jobmst_id>
    <jobmst_type>1</jobmst_type>
    <jobmst_prntid/>
    <jobmst_active>N</jobmst_active>
    <evntmst_id>
      <evntmst_id>1</evntmst_id>
      <evntmst_name>Daily                                                       </evntmst_name>
      <evntmst_desc/>
      <evntmst_owner>2</evntmst_owner>
      <evntmst_lstchgtm>2009-12-04 12:28:52</evntmst_lstchgtm>
      <evntmst_lstcmptm>2014-08-28 12:00:29</evntmst_lstcmptm>
      <evntmst_fromdt/>
      <evntmst_untildt>2012-12-31 00:00:00</evntmst_untildt>
      <evntmst_frcstdt>2012-12-31 00:00:00</evntmst_frcstdt>
      <evntmst_type>2</evntmst_type>
      <evntmst_subtype/>
      <evntmst_freq>1</evntmst_freq>
      <evntmst_crttm>2009-12-04 11:22:03</evntmst_crttm>
      <evntmst_totcnt/>
      <evntmst_public>Y</evntmst_public>
      <evntmst_months>YYYYYYYYYYYY</evntmst_months>
      <evntmst_weeks>NNNNN</evntmst_weeks>
      <evntmst_monthdays>YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY</evntmst_monthdays>
      <evntmst_weekdays>YYYYYYY</evntmst_weekdays>
      <evntmst_offset>0</evntmst_offset>
      <evntmst_intsect>N</evntmst_intsect>
      <evntmst_occur>0</evntmst_occur>
      <evntmst_timeframe>0</evntmst_timeframe>
      <evntmst_calendar>0</evntmst_calendar>
      <evntmst_fiscal>0</evntmst_fiscal>
    </evntmst_id>
    <jobmst_evntoffset/>
    <jobmst_name>SWIFT</jobmst_name>
    <jobmst_mode>0</jobmst_mode>
    <jobmst_owner>
      <owner_id>128</owner_id>
      <owner_type>2</owner_type>
      <owner_name>SWIFT                         </owner_name>
      <owner_allagents>Y</owner_allagents>
    </jobmst_owner>
    <jobmst_desc/>
    <jobmst_crttm>2009-04-24 14:17:56</jobmst_crttm>
    <jobdtl_id>
      <jobdtl_id>3493</jobdtl_id>
      <jobdtl_cmd/>
      <jobdtl_envfile/>
      <jobdtl_retnsn>180</jobdtl_retnsn>
      <jobdtl_allowadhoc>Y</jobdtl_allowadhoc>
      <jobdtl_waitop>N</jobdtl_waitop>
      <jobdtl_fromdt>1899-12-30 00:00:00</jobdtl_fromdt>
      <jobdtl_untildt>1899-12-30 00:00:00</jobdtl_untildt>
      <jobdtl_fromtm/>
      <jobdtl_untiltm/>
      <jobdtl_proxy>
        <usrmst_id>4</usrmst_id>
        <usrmst_domain>CPPIB               </usrmst_domain>
        <usrmst_name>svc_tidal</usrmst_name>
        <usrmst_fullname>Tidal User</usrmst_fullname>
        <usrmst_desc/>
        <usrmst_phoneno/>
        <usrmst_pagerno/>
        <usrmst_email/>
        <usrmst_emailtype>0</usrmst_emailtype>
        <secmst_id>1</secmst_id>
        <lngmst_id>1</lngmst_id>
        <usrmst_password>@JrL(OOLO8RSAWKX</usrmst_password>
        <usrmst_externid/>
        <usrmst_suser>Y</usrmst_suser>
        <usrmst_lstchgtm>2013-05-30 12:18:13</usrmst_lstchgtm>
        <usrmst_sappassword/>
        <usrmst_pspassword/>
        <usrmst_aspassword/>
        <usrmst_orapassword/>
        <usrmst_wingroup>N</usrmst_wingroup>
      </jobdtl_proxy>
      <jobdtl_proxy2/>
      <jobdtl_interval/>
      <jobdtl_intervalcnt/>
      <jobdtl_unit/>
      <jobdtl_duration>55826</jobdtl_duration>
      <jobdtl_concur>1</jobdtl_concur>
      <jobdtl_priority>50</jobdtl_priority>
      <jobdtl_minrun>60</jobdtl_minrun>
      <jobdtl_maxrun>60</jobdtl_maxrun>
      <jobdtl_failalarm/>
      <nodmst_id/>
      <nodlstmst_id>
        <nodlstmst_id>27</nodlstmst_id>
        <nodlstmst_name>OPS_SharedWIN                 </nodlstmst_name>
        <nodlstmst_desc/>
        <nodlstmst_type>1</nodlstmst_type>
        <nodlstmst_prntid/>
        <nodlstmst_seq/>
        <nodlstmst_ostype>1</nodlstmst_ostype>
        <nodlstmst_lastused/>
        <nodlstmst_lstchgtm>2014-07-23 15:31:37</nodlstmst_lstchgtm>
        <servicemst_id/>
      </nodlstmst_id>
      <jobdtl_inhevent>N</jobdtl_inhevent>
      <jobdtl_inhoptions>N</jobdtl_inhoptions>
      <jobdtl_inhagent>N</jobdtl_inhagent>
      <jobdtl_inhrepeat>N</jobdtl_inhrepeat>
      <jobdtl_inhtime>N</jobdtl_inhtime>
      <jobdtl_timewin>0</jobdtl_timewin>
      <jobdtl_saveoutput>Y</jobdtl_saveoutput>
      <jobdtl_outputname/>
      <jobdtl_trackmethod>1</jobdtl_trackmethod>
      <jobdtl_trackcmd/>
      <jobdtl_deplogic>1</jobdtl_deplogic>
      <jobdtl_rerun/>
      <jobdtl_params/>
      <jobdtl_sapcount/>
      <jobdtl_normalexit>0</jobdtl_normalexit>
      <jobdtl_normalrange>0</jobdtl_normalrange>
      <jobdtl_normalop>1</jobdtl_normalop>
      <jobdtl_deprerun>N</jobdtl_deprerun>
      <jobdtl_carryover>1</jobdtl_carryover>
      <jobdtl_psjob/>
      <jobdtl_savelogonly>N</jobdtl_savelogonly>
      <jobdtl_trxid>0</jobdtl_trxid>
      <jobdtl_rerunok>Y</jobdtl_rerunok>
      <jobdtl_workdir/>
      <jobdtl_extinfo/>
      <servicemst_id/>
      <jobdtl_estmethod>1</jobdtl_estmethod>
      <jobdtl_nearoutage>3</jobdtl_nearoutage>
      <jobdtl_trackcl/>
      <jobdtl_statuscl/>
      <jobdtl_abrtonclderr/>
      <jobdtl_estdurexclude>4</jobdtl_estdurexclude>
    </jobdtl_id>
    <jobmst_lstchgtm>2009-04-24 14:19:02</jobmst_lstchgtm>
    <jobmst_runbook/>
    <jobcls_id/>
    <jobmst_prntname/>
    <jobmst_alias>3493A     </jobmst_alias>
    <jobmst_dirty> </jobmst_dirty>
    <job_dependencies/>
    <job_events/>
  </list-item>
</root>

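I am writing another application that needs to get the value of the <jobmst_id></jobmst_id> element, because I want to use that value as the file name when saving the file.

I already have the saving part working, but I'm not sure how to get the element value.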

            fullsrcurl = self.srcjson + '?format=xml&jobname=' + job
            fulltrgurl = self.targetjson + '?format=xml&jobname=' + job
            file = urllib2.urlopen(fullsrcurl)
            doc = etree.parse(file)
            data = etree.tostring(doc, pretty_print=True)       

            file = 'c:\\temp\\deployments\\DEV\\test.xml'
            xmlsave = open(file, 'w')
            xmlsave.write(data)
            xmlsave.close()  # parentheses are required; a bare xmlsave.close never closes the file

Best answer

You can use the XPath syntax described in the documentation:

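# ET is assumed to be the standard-library ElementTree; lxml.etree offers the same calls
import xml.etree.ElementTree as ET

root = ET.fromstring(data)
jobmst_id_tag = root.find('./list-item/jobmst_id')
jobmst_id_value = jobmst_id_tag.text
print(jobmst_id_value)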

This gives:

3493
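
To connect this back to the question's goal of naming the saved file after the job id, here is a minimal sketch in Python 3 using the standard-library ElementTree. The helper name `save_job_xml`, the trimmed `SAMPLE_XML` payload, and the temporary output directory are assumptions made for illustration; they are not part of the original code. It uses `findtext`, which returns `None` when the element is missing, so a bad response fails with a clear error instead of an `AttributeError` on `.text`.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the API response shown in the question.
SAMPLE_XML = b"""<root>
  <list-item>
    <jobmst_id>3493</jobmst_id>
    <jobmst_name>SWIFT</jobmst_name>
    <jobmst_alias>3493A     </jobmst_alias>
  </list-item>
</root>"""


def save_job_xml(data, target_dir):
    """Write `data` to <target_dir>/<jobmst_id>.xml and return the path.

    Hypothetical helper: the name and signature are illustrative only.
    """
    root = ET.fromstring(data)
    # findtext returns None if the element is absent and '' if it is empty.
    job_id = root.findtext('./list-item/jobmst_id')
    if job_id is None or not job_id.strip():
        raise ValueError('no jobmst_id found in the XML')
    # Strip padding, since some values in this feed carry trailing spaces.
    path = os.path.join(target_dir, job_id.strip() + '.xml')
    # Binary mode, because the payload is bytes; 'with' closes the file.
    with open(path, 'wb') as xmlsave:
        xmlsave.write(data)
    return path


if __name__ == '__main__':
    out_dir = tempfile.mkdtemp()
    print(save_job_xml(SAMPLE_XML, out_dir))
```

In the question's code, `data` would be the bytes returned by `etree.tostring(doc, pretty_print=True)` and `target_dir` would be the deployments folder, in place of the sample payload and temporary directory used here.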

For more on python - extracting an element from XML, see the similar question on Stack Overflow: https://stackoverflow.com/questions/25576203/
