I am trying to create a copy of a CSV file without its header row. When I attempt this, I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 1895: invalid start byte.
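I have read the Python documentation on CSV, Unicode, and UTF-8 encoding and implemented what it describes. However, the generated output file contains no data. I'm not sure what I am doing wrong here.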
import csv

path = '/Users/johndoe/file.csv'

with open(path, 'r') as infile, open(path + 'final.csv', 'w') as outfile:

    def unicode_csv(infile, outfile):
        inputs = csv.reader(utf_8_encoder(infile))
        output = csv.writer(outfile)

        for index, row in enumerate(inputs):
            yield [unicode(cell, 'utf-8') for cell in row]
            if index == 0:
                continue
            output.writerow(row)

    def utf_8_encoder(infile):
        for line in infile:
            yield line.encode('utf-8')

    unicode_csv(infile, outfile)
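One likely reason the output file is empty: unicode_csv contains a yield, so Python treats it as a generator function, and calling it without iterating over the result runs none of its body. The sketch below illustrates this. The copy_rows function and the written list are hypothetical stand-ins for the function above and for the CSV writer:

```python
written = []  # stands in for the output file in this sketch


def copy_rows(rows):
    """Hypothetical generator with a side effect, like output.writerow(row)."""
    for row in rows:
        written.append(row)
        yield row


gen = copy_rows([["a"], ["b"]])   # only creates a generator object
after_call = list(written)        # nothing has been "written" yet

for _ in gen:                     # iterating is what runs the body
    pass
after_iteration = list(written)   # now both rows have been "written"
```

Removing the yield, or looping over the value returned by the call, makes the body execute.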
Best answer
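The solution was simply to add two parameters to
with open(path, 'r') as infile:
The two parameters are encoding='UTF-8' and errors='ignore'. This let me create a copy of the original CSV without the header and without the UnicodeDecodeError. The completed code is below.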
import csv

path = '/Users/johndoe/file.csv'

with open(path, 'r', encoding='utf-8', errors='ignore') as infile, open(path + 'final.csv', 'w') as outfile:
    inputs = csv.reader(infile)
    output = csv.writer(outfile)

    for index, row in enumerate(inputs):
        # Create file with no header
        if index == 0:
            continue
        output.writerow(row)
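Note that errors='ignore' silently drops every byte that is not valid UTF-8, so some characters from the original file are lost. Byte 0xa0 is a non-breaking space in cp1252 and Latin-1, which suggests (but does not prove) that the file was saved in one of those encodings. A minimal sketch of an alternative under that assumption follows. The copy_without_header helper and its src_encoding default are hypothetical, not part of the original answer:

```python
import csv


def copy_without_header(src, dst, src_encoding="cp1252"):
    """Copy a CSV from src to dst, dropping the first row and writing UTF-8.

    src_encoding is an assumption: 0xa0 is a non-breaking space in
    cp1252/Latin-1, but the real file may use a different encoding.
    """
    # newline="" lets the csv module handle line endings itself
    with open(src, "r", encoding=src_encoding, newline="") as infile, \
         open(dst, "w", encoding="utf-8", newline="") as outfile:
        reader = csv.reader(infile)
        writer = csv.writer(outfile)
        next(reader, None)        # skip the header row, if there is one
        writer.writerows(reader)  # write the remaining rows unchanged
```

With the path from the question, this would be called as copy_without_header(path, path + 'final.csv'). If the guessed encoding is wrong, the text will come out garbled or decoding may fail, so it is worth checking a few rows of the result.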
Regarding python - converting CSV to UTF-8 in Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/32403209/