python - How do I replace HTML escape characters using Python?


Possible Duplicate:
Decode HTML entities in Python string?

I have a string that is full of HTML escape characters such as &quot;, &rdquo;, and &mdash;.

Is there a Python library that offers a reliable way to replace all of these escape characters with the actual characters they represent?

For example, I want every &quot; replaced with a literal " character.

Best answer

You want to use this:

try:
    # Python 3.4+; HTMLParser.unescape() was removed in Python 3.9
    from html import unescape
except ImportError:
    # Python 2 fallback
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape

# html_encoded_string is the escaped input string
html_decoded_string = unescape(html_encoded_string)
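On Python 3.4 and later, the standard library's `html.unescape` handles named and numeric character references directly. As a minimal sketch of what the decoding looks like, assuming a made-up sample string (the value of `html_encoded_string` below is hypothetical and only mirrors the entities named in the question):

```python
import html

# Hypothetical sample input containing the entities from the question.
html_encoded_string = "&quot;Hello&rdquo; &mdash; world &amp; more"

# Replace every character reference with the character it stands for.
html_decoded_string = html.unescape(html_encoded_string)

# Numeric references (decimal and hexadecimal) are decoded as well.
decimal_dash = html.unescape("&#8212;")
hex_dash = html.unescape("&#x2014;")

print(html_decoded_string)
```

Text that contains no character references passes through unchanged, so the call is safe to apply to strings that are only partly escaped.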

I've also seen a lot of love for BeautifulSoup:

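# BeautifulSoup 3 API (Python 2 only); BeautifulSoup 4 dropped convertEntities
# and converts entities to Unicode characters on its own.
from BeautifulSoup import BeautifulSoup
html_decoded_string = BeautifulSoup(html_encoded_string, convertEntities=BeautifulSoup.HTML_ENTITIES)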

This is also a duplicate of these existing questions:

Decode HTML entities in Python string?

Decoding HTML entities with Python

Decoding HTML Entities With Python

This question originally appeared on Stack Overflow: https://stackoverflow.com/questions/11405996/
