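Windows 7, Python 3.6 (32-bit, Python36-32). Purpose of the code: parsing a website.
Can you suggest what might be causing this error? I declared UTF-8 encoding at the beginning of the file and passed (encoding = 'windows_1252', errors = 'replace') to the "open" function. That helped in similar parsers I wrote for other sites, but it does not work for this one.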
Code snippet:
# cycle through pages
for i in range (count):
    s = str (i + 1)
    print (s, end = '')
    # make url
    url = url1 + s + url2 + str (status) + url3
    # get html file from server by url
    r = requests.get (url)
    # open file to save with full path to file name
    name = path + 'upload' + s + '.html'
    f = open (name, 'w', encoding = 'windows_1252', errors = 'replace')
    # save url data to file
    f.write (r.text)
    # close file
    f.close ()
    # download files through the list
    parseList (name, s + '.html')
print ()
return
Error text:
Traceback (most recent call last):
  File "C:\Users\u6030283\Desktop\FINAM\finam_parser_new.py", line 478, in <module>
    parse('list', 'html', 'XS1272198265')
  File "C:\Users\u6030283\Desktop\FINAM\finam_parser_new.py", line 262, in parse
    f.write(r.text)
  File "C:\Users\u6030283\AppData\Local\Programs\Python\Python36-32\lib\encodings\cp1251.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\x97' in position 206: character maps to <undefined>
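For reference, the exception in the traceback can be reproduced in isolation. The sketch below is not taken from the original script. It only uses the built-in codecs to show that the character '\x97' (U+0097) has no mapping in cp1251 or cp1252, and that errors='replace' substitutes '?' instead of raising. The helper name try_encode is hypothetical. The last two lines illustrate one possible origin of the character, which is an assumption about this site and not something the question confirms: the byte 0x97 decoded as Latin-1 yields U+0097, while decoded as cp1252 it yields U+2014.

```python
# The character from the traceback: U+0097, a C1 control code.
ch = '\x97'


def try_encode(text, encoding, errors='strict'):
    """Return the encoded bytes, or the exception class name if encoding fails.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


# Strict encoding fails in both Windows code pages.
strict_1251 = try_encode(ch, 'cp1251')
strict_1252 = try_encode(ch, 'cp1252')

# With errors='replace' the unencodable character is written as '?'.
replaced = try_encode(ch, 'cp1252', errors='replace')

# Assumption for illustration: the same byte decoded with two different codecs.
as_latin1 = b'\x97'.decode('latin-1')   # U+0097, the character in the error
as_cp1252 = b'\x97'.decode('cp1252')    # U+2014

print(strict_1251, strict_1252, replaced)
print(as_latin1 == ch, as_cp1252 == '\u2014')
```

Since the traceback passes through cp1251.py, the file in that run was apparently being written with the cp1251 codec and strict error handling, not with the windows_1252 / errors = 'replace' arguments shown in the snippet.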
Best answer
Update:
The problem is not in the code above (writing the file) but in the parse() or parseList() methods, where the file is read.
Replace the following
# in parseList(...)
text = open(url, 'r')
# and in parse(..)
text = open(path + file, 'r')
with
# in parseList(...)
text = open(url, 'r', encoding='windows_1252')
# and in parse(..)
text = open(path + file, 'r', encoding='windows_1252')
And do not forget to revert the code in the question above to its original state.
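The idea behind the replacement, using the same explicit encoding when the file is written and when it is read back, can be sketched as a small round trip. This is a minimal illustration under stated assumptions, not the original parser: save_text, load_text, the ENCODING constant and the sample string are all hypothetical, and a temporary directory stands in for the real path.

```python
import os
import tempfile

# Same codec on both sides, as in the answer above.
ENCODING = 'windows_1252'


def save_text(name, text):
    """Write text with an explicit encoding; unencodable characters become '?'."""
    with open(name, 'w', encoding=ENCODING, errors='replace') as f:
        f.write(text)


def load_text(name):
    """Read the file back with the same explicit encoding."""
    with open(name, 'r', encoding=ENCODING) as f:
        return f.read()


# Hypothetical sample standing in for r.text: U+2014 is encodable in
# windows_1252, while U+0097 is not.
sample = '<p>price \u2014 100\x97</p>'

with tempfile.TemporaryDirectory() as tmp:
    name = os.path.join(tmp, 'upload1.html')
    save_text(name, sample)
    restored = load_text(name)

# ascii() keeps the output printable on any console encoding.
print(ascii(restored))
```

Without the encoding argument, open() falls back to the locale's preferred encoding, which is why a file written as windows_1252 may be read back with a different codec on a machine whose default is cp1251. The with statement also closes the file even if write() raises.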
Source: the Stack Overflow question "python - UnicodeEncodeError: 'charmap' codec can't encode character '\x97' in position 206: character maps to <undefined>": https://stackoverflow.com/questions/53701826/