python - pandas read_json on line-delimited JSON returns a JsonReader instead of a DataFrame

I have a file, file1.json, with the following contents (each dictionary is on its own line):

{"a":1,"b":2}
{"c":3,"d":4}
{"e":9,"f":6}
.
.
.
{"u":31,"v":23}
{"w":87,"x":46}
{"y":98,"z":68}

I want to load this file into a pandas DataFrame, so this is what I did:

df = pd.read_json('../Dataset/file1.json', orient='columns', lines=True, chunksize=10)

But this returns a JsonReader rather than a DataFrame.

[IN]: df
[OUT]: <pandas.io.json.json.JsonReader at 0x7f873465bd30>

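Is this normal, or am I doing something wrong? If this is how read_json() is supposed to behave when a single JSON file holds multiple dictionaries (not separated by commas), each on its own line, what is the best way to get them into a DataFrame?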

Edit: if I remove the chunksize parameter from read_json(), this is what I get:

[IN]: df = pd.read_json('../Dataset/file1.json', orient='columns', lines=True)
[OUT]: ValueError: Expected object or value

Best answer

As the docs explain, this is exactly the point of the chunksize parameter:

chunksize: integer, default None

Return JsonReader object for iteration. See the line-delimited json docs for more information on chunksize. This can only be passed if lines=True. If this is None, the file will be read into memory all at once.

The linked docs say:

For line-delimited json files, pandas can also return an iterator which reads in chunksize lines at a time. This can be useful for large files or to read from a stream.

...and then give an example of how to use it.
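As a rough sketch of that kind of usage (not the example from the pandas docs), the reader returned when chunksize is set can be iterated, with each step yielding a DataFrame of at most chunksize rows. The in-memory sample data, its column names, and the variable names here are assumptions made for illustration; the real file1.json would be passed by path instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for file1.json: one JSON object per line.
sample = io.StringIO(
    '{"a":1,"b":2}\n'
    '{"a":3,"b":4}\n'
    '{"a":5,"b":6}\n'
)

# With chunksize set, read_json returns a JsonReader, not a DataFrame.
reader = pd.read_json(sample, lines=True, chunksize=2)

# Iterating the reader yields one DataFrame per chunk of lines.
chunks = []
for chunk in reader:
    chunks.append(chunk)

# Optionally stitch the chunks back together into a single DataFrame.
df = pd.concat(chunks, ignore_index=True)
```

Collecting every chunk and concatenating them gives up the memory benefit of chunking, so for a genuinely large file each chunk would more likely be processed and discarded inside the loop.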

If you don't want that, why are you passing chunksize? Just leave it out.
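For comparison, here is a minimal sketch of the same call without chunksize, which reads well-formed line-delimited JSON straight into a DataFrame. The in-memory sample is a hypothetical stand-in for the file. The answer does not explain the ValueError shown in the question's edit, and this sketch does not try to either; it only shows the behavior on input where every line is a valid JSON object.

```python
import io

import pandas as pd

# Hypothetical stand-in for a well-formed line-delimited JSON file.
sample = io.StringIO(
    '{"a":1,"b":2}\n'
    '{"a":3,"b":4}\n'
)

# Without chunksize, the whole input is read at once and a DataFrame is returned.
df = pd.read_json(sample, lines=True)
```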

Source: Stack Overflow, https://stackoverflow.com/questions/50383910/
