python - 'CT_Highlight' object has no attribute 'attribute'

I am trying to read text from a Word document (docx) and find all the text highlighted in yellow, but I get an error message.

import docx
document = docx.Document(r'C:/Users/devff/Documents/Prac2.docx')
rs = document._element.xpath("//w:r")
WPML_URI = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'
tag_rPr = WPML_URI + 'rPr'
tag_highlight = WPML_URI + 'highlight'
tag_val = WPML_URI + 'val'
tag_t = WPML_URI + 't'
for word in rs:
    for rPr in word.findall(tag_rPr):
        high = rPr.findall(tag_highlight)
        for hi in high:
            if hi.attribute[tag_val] == 'yellow':  ##here is the problem
                print(word.find(tag_t).text.encode('utf-8').lower())

Ideally it should print the text that is highlighted in yellow, but instead it just gives me:

AttributeError: 'CT_Highlight' object has no attribute 'attribute'

Best answer

I think you are looking for .attrib, not .attribute.

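Fixing that will get you to the next step, but the way you have structured it is not very robust, because it will raise an exception if no val attribute is present. I recommend _Element.get() (https://lxml.de/api/lxml.etree._Element-class.html), which simply returns None if there is no attribute with the requested name: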

if hi.get(tag_val) == 'yellow':
    ...
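
To make the difference between .attrib[...] and .get() concrete, here is a minimal sketch of the corrected loop. It is not the asker's exact setup: it assumes the standard library's xml.etree.ElementTree as a stand-in for the lxml elements that python-docx exposes (both offer .get() and .attrib for this purpose), and it parses a made-up XML fragment, SAMPLE, instead of a real .docx file. The helper name yellow_runs is hypothetical.

```python
import xml.etree.ElementTree as ET

W = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
WPML_URI = '{' + W + '}'
tag_r = WPML_URI + 'r'
tag_rPr = WPML_URI + 'rPr'
tag_highlight = WPML_URI + 'highlight'
tag_val = WPML_URI + 'val'
tag_t = WPML_URI + 't'

# Contrived fragment (an assumption for illustration, not taken from a real file):
# one yellow run, one green run, one highlight with no w:val, one plain run.
SAMPLE = f'''<w:p xmlns:w="{W}">
  <w:r><w:rPr><w:highlight w:val="yellow"/></w:rPr><w:t>Marked</w:t></w:r>
  <w:r><w:rPr><w:highlight w:val="green"/></w:rPr><w:t>Other</w:t></w:r>
  <w:r><w:rPr><w:highlight/></w:rPr><w:t>NoVal</w:t></w:r>
  <w:r><w:t>Plain</w:t></w:r>
</w:p>'''


def yellow_runs(root):
    """Return the text of every run whose highlight value is 'yellow'."""
    found = []
    for run in root.iter(tag_r):
        for rPr in run.findall(tag_rPr):
            for hi in rPr.findall(tag_highlight):
                # .get() returns None when w:val is absent,
                # whereas hi.attrib[tag_val] would raise KeyError.
                if hi.get(tag_val) == 'yellow':
                    t = run.find(tag_t)
                    # A run may have no w:t child, so guard before reading .text.
                    if t is not None and t.text:
                        found.append(t.text)
    return found


root = ET.fromstring(SAMPLE)
result = yellow_runs(root)
print(result)
```

With a real document, the same loop body should apply to the runs returned by document._element.xpath("//w:r") in the question. The extra check on run.find(tag_t) also avoids a second possible failure, since a run without a w:t child would make .text fail on None.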

Regarding python - 'CT_Highlight' object has no attribute 'attribute', a similar question was found on Stack Overflow: https://stackoverflow.com/questions/55959113/
