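I have an XML document with the following structure, and I want to select the valueStart and valueEnd attribute values based on the value of the bsdate attribute of Balance and the value of the rowNum attribute of BalanceRows. For example:
for bsdate = '2013' and rowNum = '200', valueStart should be '3000' and valueEnd '4000'
for bsdate = '2014' and rowNum = '102', valueStart should be '5500' and valueEnd '6500'
Can this be done in R? I spent a whole day looking for an answer but could not find one.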
<Root
xmlns:a="http://www.w3.org/TR/html4/">
<Balance bsdate = '2013' bsregdate = '2014.04.01'>
<BalanceRows type = 'B' rowNum = '100' valueStart = '1000' valueEnd = '2000'></BalanceRows>
<BalanceRows type = 'B' rowNum = '101' valueStart = '3000' valueEnd = '4000'></BalanceRows>
<BalanceRows type = 'B' rowNum = '102' valueStart = '5000' valueEnd = '6000'></BalanceRows>
<BalanceRows type = 'P' rowNum = '200' valueStart = '7000' valueEnd = '8000'></BalanceRows>
<BalanceRows type = 'P' rowNum = '201' valueStart = '9000' valueEnd = '10000'></BalanceRows>
</Balance>
<Balance bsdate = '2014' bsregdate = '2015.04.02'>
<BalanceRows type = 'B' rowNum = '100' valueStart = '1500' valueEnd = '2500' ></BalanceRows>
<BalanceRows type = 'B' rowNum = '101' valueStart = '3500' valueEnd = '4500'></BalanceRows>
<BalanceRows type = 'B' rowNum = '102' valueStart = '5500' valueEnd = '6500'></BalanceRows>
<BalanceRows type = 'P' rowNum = '200' valueStart = '7500' valueEnd = '8500'></BalanceRows>
<BalanceRows type = 'P' rowNum = '201' valueStart = '9500' valueEnd = '15000'></BalanceRows>
</Balance>
</Root>
Best answer
Using the test data
library(xml2)
xx <- read_xml("<Root xmlns:a=\"http://www.w3.org/TR/html4/\">
<Balance bsdate = '2013' bsregdate = '2014.04.01'>
<BalanceRows type = 'B' rowNum = '100' valueStart = '1000' valueEnd = '2000'></BalanceRows>
<BalanceRows type = 'B' rowNum = '101' valueStart = '3000' valueEnd = '4000'></BalanceRows>
<BalanceRows type = 'B' rowNum = '102' valueStart = '5000' valueEnd = '6000'></BalanceRows>
<BalanceRows type = 'P' rowNum = '200' valueStart = '7000' valueEnd = '8000'></BalanceRows>
<BalanceRows type = 'P' rowNum = '201' valueStart = '9000' valueEnd = '10000'></BalanceRows>
</Balance>
<Balance bsdate = '2014' bsregdate = '2015.04.02'>
<BalanceRows type = 'B' rowNum = '100' valueStart = '1500' valueEnd = '2500' ></BalanceRows>
<BalanceRows type = 'B' rowNum = '101' valueStart = '3500' valueEnd = '4500'></BalanceRows>
<BalanceRows type = 'B' rowNum = '102' valueStart = '5500' valueEnd = '6500'></BalanceRows>
<BalanceRows type = 'P' rowNum = '200' valueStart = '7500' valueEnd = '8500'></BalanceRows>
<BalanceRows type = 'P' rowNum = '201' valueStart = '9500' valueEnd = '15000'></BalanceRows>
</Balance>
</Root>")
you can get the attributes of the matching row with
xml_attrs(xml_find_all(xx, "//Balance[@bsdate = '2014']/BalanceRows[@rowNum='200']"))
# [[1]]
#       type     rowNum valueStart   valueEnd
#        "P"      "200"     "7500"     "8500"
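If only valueStart and valueEnd are needed, the same XPath can be combined with xml_attr(), which reads a single attribute by name. The following is a minimal sketch rather than a definitive solution: get_values() is a hypothetical helper name, not part of xml2, and the document is a shortened copy of the test data so that the block runs on its own.

```r
library(xml2)

# Shortened copy of the test data (assumption: same structure as the full file).
doc <- read_xml("<Root xmlns:a='http://www.w3.org/TR/html4/'>
  <Balance bsdate='2013' bsregdate='2014.04.01'>
    <BalanceRows type='B' rowNum='101' valueStart='3000' valueEnd='4000'/>
    <BalanceRows type='P' rowNum='200' valueStart='7000' valueEnd='8000'/>
  </Balance>
  <Balance bsdate='2014' bsregdate='2015.04.02'>
    <BalanceRows type='B' rowNum='102' valueStart='5500' valueEnd='6500'/>
    <BalanceRows type='P' rowNum='200' valueStart='7500' valueEnd='8500'/>
  </Balance>
</Root>")

# Hypothetical helper: look up one BalanceRows node by its parent's bsdate
# and its own rowNum, then return the two attributes of interest.
get_values <- function(doc, bsdate, row_num) {
  xpath <- sprintf("//Balance[@bsdate='%s']/BalanceRows[@rowNum='%s']",
                   bsdate, row_num)
  # xml_find_first() returns a "missing" node when nothing matches,
  # and xml_attr() then yields NA instead of raising an error.
  node <- xml_find_first(doc, xpath)
  c(valueStart = xml_attr(node, "valueStart"),
    valueEnd   = xml_attr(node, "valueEnd"))
}

get_values(doc, "2014", "102")
```

The attributes come back as character strings, so wrap the result in as.numeric() if numbers are needed. Building the XPath with sprintf() is fine for trusted values like these, but it does no escaping, so it is not suitable for arbitrary user input.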
Source: the Stack Overflow question "r - Using XPath in R to get a node's attributes based on another attribute and its parent's attribute": https://stackoverflow.com/questions/46287350/