python - csv.reader reading from a requests stream: iterator should return strings, not bytes

Tags: python django python-3.x csv python-requests

I'm trying to stream a response into csv.reader using requests.get(url, stream=True) so that I can process a fairly large data feed. My code works on Python 2.7. Here it is:

response = requests.get(url, stream=True)
ret = csv.reader(response.iter_lines(decode_unicode=True), delimiter=delimiter, quotechar=quotechar,
    dialect=csv.excel_tab)
for line in ret:
    line.get('name')

Unfortunately, after migrating to Python 3.6, I get the following error:

_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

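I tried to find a wrapper/decorator that would convert the results of the response.iter_lines() iterator from bytes to strings, without success. I have already tried the io package and codecs. codecs.iterdecode does not split the data into lines; it probably just splits by chunk_size, in which case csv.reader complains as follows: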

_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
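The difference between chunk-wise and line-wise decoding can be reproduced without any network access. The sketch below is only an illustration: the `chunks` list is a hypothetical stand-in for what a streamed response might yield, and `try_parse` is a helper invented for this demo.

```python
import codecs
import csv

# Hypothetical stand-in for streamed byte chunks whose boundaries
# do not line up with line endings.
chunks = [b"name,age\nal", b"ice,30\nbob,", b"25\n"]


def try_parse(text_iter):
    """Return the parsed rows, or the csv.Error raised while parsing."""
    try:
        return list(csv.reader(text_iter))
    except csv.Error as exc:
        return exc


# Decoding chunk by chunk yields strings that still contain embedded
# newlines, which csv.reader rejects outside quoted fields.
chunk_result = try_parse(codecs.iterdecode(chunks, "utf-8"))

# Splitting into lines first hands csv.reader one record per item.
# (Joining all chunks is for the demo only; it defeats streaming.)
lines = b"".join(chunks).splitlines()
line_result = try_parse(line.decode("utf-8") for line in lines)

print(type(chunk_result).__name__)
print(line_result)
```

csv.reader expects each item from its iterator to be one line of text, so decoding has to happen after the stream has been split into lines, not before.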

Best answer

I'm guessing you could wrap it in a genexp and feed decoded lines to it:

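import csv
from contextlib import closing

import requests

# url is assumed to be defined elsewhere, as in the question
with closing(requests.get(url, stream=True)) as r:
    # iter_lines() yields bytes; decode each line lazily before csv.reader sees it
    f = (line.decode('utf-8') for line in r.iter_lines())
    reader = csv.reader(f, delimiter=',', quotechar='"')
    for row in reader:
        print(row)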

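Using some sample data on 3.5, this shuts up csv.reader, because every line fed to it is first decoded in the genexp. Also, I'm using closing from contextlib, as is generally suggested, to automatically close the response.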
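The decoding step can be tried in isolation, without a live request. This is a minimal sketch under stated assumptions: `decode_lines` is a hypothetical helper name, `fake_lines` simulates what r.iter_lines() would yield (byte strings, one per line, without line endings), and the data is assumed to be UTF-8.

```python
import csv


def decode_lines(byte_lines, encoding="utf-8"):
    """Lazily decode an iterable of byte lines into str lines."""
    return (line.decode(encoding) for line in byte_lines)


# Simulated output of r.iter_lines(): bytes, one line per item.
fake_lines = iter([b"name,age", b'"Doe, Jane",30', b"bob,25"])

# csv.reader pulls one decoded line at a time, so nothing is buffered up front.
rows = list(csv.reader(decode_lines(fake_lines), delimiter=",", quotechar='"'))
print(rows)
```

With a real response, `decode_lines(r.iter_lines())` would take the place of `decode_lines(fake_lines)`. Because the generator decodes one line at a time, the large feed is still processed as a stream.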

This question and answer, "python - csv.reader reading from a requests stream: iterator should return strings, not bytes", come from a similar question on Stack Overflow: https://stackoverflow.com/questions/39619676/
