python - Extracting data with XPath and lxml for Python

Tags: python xpath lxml

<th><span class="sic_edu_series_popup {keyword : 'EPS_STOCK'}">EPS</span>
          (SGD) <sup class="sic_legend">a
          , j

    </sup></th>
    <td><strong>1.89766</strong></td>
    <th><span class="sic_edu_series_popup {keyword : 'TRAILING_EPS_STOCK'}">Trailing EPS</span>
      (SGD) <sup class="sic_legend">e</sup></th>
    <td><strong>1.87198</strong></td>
    <th><span class="sic_edu_series_popup {keyword : 'NAV_STOCK'}">NAV</span>
      (SGD) <sup class="sic_legend">b</sup></th>
    <td><strong>18.5449</strong></td>
  </tr>

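I am trying to extract the "Trailing EPS" figure, i.e. the value "1.87198". The page has many entries in this same format under different names, such as EPS, ROE, and so on.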

tree.xpath('//th[contains(normalize-space(span), "EPS")]/sup[@class = "sic_legend"]/td/text()')

This returns nothing.

Best answer

The td element is not a child of the sup element. Instead, use the fact that th and td are siblings:

//th[contains(span, "EPS")]/following-sibling::td/strong/text()
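
As a minimal sketch of how the sibling-axis approach could be applied from lxml: the markup below is a simplified copy of the question's snippet, wrapped in `<table><tr>` so it parses as a row, and the helper name `get_value` is hypothetical. It assumes each value sits in the first `td` after its `th`, and it matches the label exactly instead of using `contains`, so that "EPS" and "Trailing EPS" are not confused.

```python
from lxml import html

# Simplified copy of the question's markup (class attributes shortened),
# wrapped in <table><tr> so that it parses as a table row.
SNIPPET = """
<table><tr>
  <th><span class="sic_edu_series_popup">EPS</span> (SGD) <sup class="sic_legend">a, j</sup></th>
  <td><strong>1.89766</strong></td>
  <th><span class="sic_edu_series_popup">Trailing EPS</span> (SGD) <sup class="sic_legend">e</sup></th>
  <td><strong>1.87198</strong></td>
  <th><span class="sic_edu_series_popup">NAV</span> (SGD) <sup class="sic_legend">b</sup></th>
  <td><strong>18.5449</strong></td>
</tr></table>
"""


def get_value(tree, label):
    """Hypothetical helper: text of the first <td> after the <th> whose <span> equals label."""
    hits = tree.xpath(
        # $label is an XPath variable, bound by the keyword argument below.
        '//th[normalize-space(span) = $label]'
        '/following-sibling::td[1]/strong/text()',
        label=label,
    )
    return hits[0].strip() if hits else None


tree = html.fromstring(SNIPPET)
trailing_eps = get_value(tree, "Trailing EPS")
print(trailing_eps)
```

On this sample, `contains(span, "EPS")` without the `[1]` index would match both the "EPS" and "Trailing EPS" headers and select every `td` that follows them, which is why the sketch narrows both the label test and the sibling step.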

For "python - Extracting data with XPath and lxml for Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40550194/
