r - Using XPath in R to get a node's attributes based on another attribute and an attribute of its parent

Tags: r xml xpath

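I have XML documents with the following structure, and I want to select the valueStart and valueEnd attribute values based on the bsdate attribute of Balance and the rowNum attribute of BalanceRows. For example: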

  • for bsdate = '2013' and rowNum = '200', valueStart should be '7000' and valueEnd '8000'

  • for bsdate = '2014' and rowNum = '102', valueStart should be '5500' and valueEnd '6500'

Can this be done in R? I spent a whole day looking for an answer but could not find one.

    <Root
    xmlns:a="http://www.w3.org/TR/html4/">
    <Balance bsdate = '2013' bsregdate = '2014.04.01'>
        <BalanceRows type = 'B' rowNum = '100' valueStart = '1000' valueEnd = '2000'></BalanceRows>
        <BalanceRows type = 'B' rowNum = '101' valueStart = '3000' valueEnd = '4000'></BalanceRows>
        <BalanceRows type = 'B' rowNum = '102' valueStart = '5000' valueEnd = '6000'></BalanceRows>
        <BalanceRows type = 'P' rowNum = '200' valueStart = '7000' valueEnd = '8000'></BalanceRows>
        <BalanceRows type = 'P' rowNum = '201' valueStart = '9000' valueEnd = '10000'></BalanceRows>
    </Balance>
    <Balance bsdate = '2014' bsregdate = '2015.04.02'>
        <BalanceRows type = 'B' rowNum = '100' valueStart = '1500' valueEnd = '2500' ></BalanceRows>
        <BalanceRows type = 'B' rowNum = '101' valueStart = '3500' valueEnd = '4500'></BalanceRows>
        <BalanceRows type = 'B' rowNum = '102' valueStart = '5500' valueEnd = '6500'></BalanceRows>
        <BalanceRows type = 'P' rowNum = '200' valueStart = '7500' valueEnd = '8500'></BalanceRows>
        <BalanceRows type = 'P' rowNum = '201' valueStart = '9500' valueEnd = '15000'></BalanceRows>
    </Balance>
    </Root>

Best answer

Using the test data:

library(xml2)
xx <- read_xml("<Root xmlns:a=\"http://www.w3.org/TR/html4/\">
  <Balance bsdate = '2013' bsregdate = '2014.04.01'>
  <BalanceRows type = 'B' rowNum = '100' valueStart = '1000' valueEnd = '2000'></BalanceRows>
  <BalanceRows type = 'B' rowNum = '101' valueStart = '3000' valueEnd = '4000'></BalanceRows>
  <BalanceRows type = 'B' rowNum = '102' valueStart = '5000' valueEnd = '6000'></BalanceRows>
  <BalanceRows type = 'P' rowNum = '200' valueStart = '7000' valueEnd = '8000'></BalanceRows>
  <BalanceRows type = 'P' rowNum = '201' valueStart = '9000' valueEnd = '10000'></BalanceRows>
  </Balance>
  <Balance bsdate = '2014' bsregdate = '2015.04.02'>
  <BalanceRows type = 'B' rowNum = '100' valueStart = '1500' valueEnd = '2500' ></BalanceRows>
  <BalanceRows type = 'B' rowNum = '101' valueStart = '3500' valueEnd = '4500'></BalanceRows>
  <BalanceRows type = 'B' rowNum = '102' valueStart = '5500' valueEnd = '6500'></BalanceRows>
  <BalanceRows type = 'P' rowNum = '200' valueStart = '7500' valueEnd = '8500'></BalanceRows>
  <BalanceRows type = 'P' rowNum = '201' valueStart = '9500' valueEnd = '15000'></BalanceRows>
  </Balance>
  </Root>")

You can get the attributes of the matching row with:

# Filter Balance by bsdate, then its BalanceRows children by rowNum
xml_attrs(xml_find_all(xx, "//Balance[@bsdate = '2014']/BalanceRows[@rowNum='200']"))
# [[1]]
#       type     rowNum valueStart   valueEnd 
#        "P"      "200"     "7500"     "8500" 
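
Since the question asks only for valueStart and valueEnd, the same XPath can be wrapped in a small function that returns just those two attributes. The following is a minimal sketch, not part of the original answer: `get_values` is a hypothetical helper name, and `doc` is a cut-down stand-in for the test data so that the block runs on its own. It assumes each bsdate/rowNum pair matches at most one row, which is why it uses `xml_find_first`.

```r
library(xml2)

# Small stand-in document with the same shape as the test data (assumption:
# only the attributes needed for the lookup are included).
doc <- read_xml("<Root>
  <Balance bsdate='2013'>
    <BalanceRows rowNum='200' valueStart='7000' valueEnd='8000'/>
  </Balance>
  <Balance bsdate='2014'>
    <BalanceRows rowNum='102' valueStart='5500' valueEnd='6500'/>
    <BalanceRows rowNum='200' valueStart='7500' valueEnd='8500'/>
  </Balance>
</Root>")

# Hypothetical helper (not an xml2 function): look up one row by the parent's
# bsdate and the row's own rowNum, then read the two attributes of interest.
get_values <- function(doc, bsdate, rowNum) {
  xpath <- sprintf("//Balance[@bsdate='%s']/BalanceRows[@rowNum='%s']",
                   bsdate, rowNum)
  row <- xml_find_first(doc, xpath)   # first matching node, or a missing node
  c(valueStart = xml_attr(row, "valueStart"),
    valueEnd   = xml_attr(row, "valueEnd"))
}

get_values(doc, "2014", "102")   # valueStart "5500", valueEnd "6500"
get_values(doc, "2013", "200")   # valueStart "7000", valueEnd "8000"
```

The values come back as character strings; wrap the result in `as.numeric()` if numbers are needed. When no row matches, both entries are `NA`. Note that the `xmlns:a` declaration in the original document only binds a prefix, so the unprefixed Balance and BalanceRows elements can be addressed without any namespace handling.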

Source: this question and answer come from Stack Overflow: https://stackoverflow.com/questions/46287350/
