python-3.x - Why am I getting "UnicodeEncodeError: 'charmap' codec can't encode character '\u25b2' in position 84811: character maps to <undefined>" error?

Tags: python-3.x web-scraping beautifulsoup encoding

I get UnicodeEncodeError: 'charmap' codec can't encode character '\u200b' in position 756: character maps to <undefined> when running this code:

from bs4 import BeautifulSoup
import requests
r = requests.get('https://stackoverflow.com').text
soup = BeautifulSoup(r, 'lxml')
print(soup.prettify())

The output is:

Traceback (most recent call last):
  File "c:\Users\Asus\Documents\Hello World\Web Scraping\st.py", line 5, in <module>
    print(soup.prettify())
  File "C:\Users\Asus\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u200b' in position 756: character maps to <undefined>

I am using Python 3.8.1 with UTF-8 in VS Code. How can I fix this?
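
The traceback passes through encodings\cp1252.py, which suggests that print() is encoding the text with the console's cp1252 code page rather than UTF-8. A minimal sketch that reproduces the failure without any network call follows; the helper name can_encode is hypothetical and only for illustration:

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

ch = "\u200b"  # ZERO WIDTH SPACE, the character named in the traceback

cp1252_ok = can_encode(ch, "cp1252")  # the codec shown in the traceback
utf8_ok = can_encode(ch, "utf-8")     # UTF-8 can represent any code point

print(cp1252_ok, utf8_ok)
```

So the editor's UTF-8 file setting is not the issue here; what matters is the encoding of the stream that print() writes to.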

Best answer

You can explore this a bit yourself, but in Python 2.7 this is what I usually use to clean up my text:

text = text.encode('utf-8').decode('ascii', 'ignore')

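In Python 3, str(text) leaves a str unchanged, so the equivalent that actually drops the non-ASCII characters is:

text = text.encode('ascii', 'ignore').decode('ascii')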

In your case, try this:

r = requests.get('https://stackoverflow.com').text.encode('utf8').decode('ascii', 'ignore')
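
To make the effect of that round trip concrete, here is a small sketch on a made-up sample string (the function name strip_non_ascii and the sample text are assumptions for illustration). Note that the characters are silently discarded, not converted:

```python
def strip_non_ascii(text):
    """Encode to UTF-8 bytes, then decode as ASCII, dropping bytes outside ASCII."""
    return text.encode("utf-8").decode("ascii", "ignore")

# Sample containing U+25B2, U+200B and U+00E9, none of which are ASCII
sample = "up\u25b2 zero\u200bwidth caf\u00e9"
cleaned = strip_non_ascii(sample)

print(cleaned)
```

This avoids the error because only ASCII remains, at the cost of losing every accented letter, symbol, and non-Latin character on the page.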

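Otherwise, in general:

import requests
from bs4 import BeautifulSoup
r = requests.get('https://stackoverflow.com')
soup = BeautifulSoup(r.content, 'lxml')  # r.content is the raw bytes of the response
print(soup)  # print is a function in Python 3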

(I don't think this should raise any error.)
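
If the console encoding still cannot represent a character, one option that keeps the information is to escape only the characters that do not fit. The sketch below is not part of the original answer; safe_print is a hypothetical helper, and the in-memory cp1252 stream merely stands in for a Windows console:

```python
import io
import sys

def safe_print(text, stream=None):
    """Print text, escaping characters the stream's encoding cannot represent."""
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # backslashreplace turns unencodable characters into \uXXXX escapes
    printable = text.encode(encoding, errors="backslashreplace").decode(encoding)
    print(printable, file=stream)

# Simulate a cp1252 console with an in-memory stream (assumption for the demo)
buffer = io.BytesIO()
console = io.TextIOWrapper(buffer, encoding="cp1252", newline="\n")
safe_print("up\u25b2 zero\u200bwidth", stream=console)
console.flush()

written = buffer.getvalue().decode("cp1252")
print(written, end="")
```

On Python 3.7 and later, calling sys.stdout.reconfigure(encoding='utf-8') or setting the PYTHONIOENCODING=utf-8 environment variable are other possibilities, assuming the terminal can actually display UTF-8.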

A similar question to this one can be found on Stack Overflow: https://stackoverflow.com/questions/62656579/
