python-polars - read_json in Polars causes an OutOfSpec error

Tags: python-polars

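I have started evaluating Polars, and it looks great compared with Pandas. My use case is running data-processing jobs on "medium"-sized data, and so far it looks very promising. However, reading a JSON file results in: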

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: OutOfSpec("offsets must not exceed the values length")

The call is:

import polars as pr
pr.read_json('./data/yelp_academic_dataset_review.json', json_lines=True)

The file is 5.0 GB and comes from the Kaggle Yelp dataset.

I am running on a Mac: 16 GB RAM, 2.3 GHz quad-core Intel Core i7, Polars 0.13.58.

What could be the cause? Thanks.

Best answer

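Update: Polars >= 0.13.59

As of Polars 0.13.59, this issue has been fixed. You can now read JSON files that contain more than 2 GB of text in a single column, so the workaround below is no longer needed.

As an added bonus, the JSON parser is now faster as well.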
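
To decide whether the workaround is still needed in a given environment, the installed version can be compared with 0.13.59. The following is only a minimal sketch: the helper names version_tuple and needs_workaround are made up for illustration, and the parsing assumes a plain dotted version string.

```python
import re

# Version named in the update above as containing the fix.
FIXED_IN = (0, 13, 59)


def version_tuple(version):
    """Turn a string such as '0.13.58' into (0, 13, 58).

    Only the first three numeric groups are used; any suffix is ignored.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def needs_workaround(version):
    """Return True if `version` is older than the release with the fix."""
    return version_tuple(version) < FIXED_IN
```

With Polars installed, the check would be called as needs_workaround(pl.__version__).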

The problem

This does not appear to be a RAM limitation, nor a malformed input file. Instead, it appears to be a limit on the amount of data parsed in json_loads.

I threw my Threadripper Pro (with 512 GB of RAM) at this. If I read the file into RAM:

import polars as pl
from io import StringIO

with open("/tmp/yelp_academic_dataset_review.json") as json_file:
    file_lines = json_file.readlines()

len(file_lines)

we get 6,990,280 lines.

>>> len(file_lines)
6990280

Using a binary search, I found that reading the first 3,785,593 lines works:

pl.read_json(StringIO("".join(file_lines[0:3_785_593])), json_lines=True)
>>> pl.read_json(StringIO("".join(file_lines[0:3_785_593])), json_lines=True)
shape: (3785593, 9)
┌────────────────────────┬──────┬─────────────────────┬───────┬─────┬───────┬─────────────────────────────────────┬────────┬────────────────────────┐
│ business_id            ┆ cool ┆ date                ┆ funny ┆ ... ┆ stars ┆ text                                ┆ useful ┆ user_id                │
│ ---                    ┆ ---  ┆ ---                 ┆ ---   ┆     ┆ ---   ┆ ---                                 ┆ ---    ┆ ---                    │
│ str                    ┆ i64  ┆ str                 ┆ i64   ┆     ┆ f64   ┆ str                                 ┆ i64    ┆ str                    │
╞════════════════════════╪══════╪═════════════════════╪═══════╪═════╪═══════╪═════════════════════════════════════╪════════╪════════════════════════╡
│ XQfwVwDr-v0ZS3_CbbE5Xw ┆ 0    ┆ 2018-07-07 22:09:11 ┆ 0     ┆ ... ┆ 3.0   ┆ If you decide to eat here, just ... ┆ 0      ┆ mh_-eMZ6K5RLWhZyISBhwA │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 7ATYjTIgM3jUlt4UM3IypQ ┆ 1    ┆ 2012-01-03 15:28:18 ┆ 0     ┆ ... ┆ 5.0   ┆ I've taken a lot of spin classes... ┆ 1      ┆ OyoGAe7OKpv6SyGZT5g77Q │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ YjUWPpI6HXG530lwP-fb2A ┆ 0    ┆ 2014-02-05 20:30:30 ┆ 0     ┆ ... ┆ 3.0   ┆ Family diner. Had the buffet. Ec... ┆ 0      ┆ 8g_iMtfSiwikVnbP2etR0A │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ kxX2SOes4o-D3ZQBkiMRfA ┆ 1    ┆ 2015-01-04 00:01:03 ┆ 0     ┆ ... ┆ 5.0   ┆ Wow!  Yummy, different,  delicio... ┆ 1      ┆ _7bHUi9Uuf5__HHc_Q8guQ │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ...                    ┆ ...  ┆ ...                 ┆ ...   ┆ ... ┆ ...   ┆ ...                                 ┆ ...    ┆ ...                    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ EaqASiPkxV9OUkvsAp4ODg ┆ 0    ┆ 2015-03-17 20:48:03 ┆ 0     ┆ ... ┆ 4.0   ┆ Small hole in the wall, yet plen... ┆ 0      ┆ OPZWPj14g2LQnDWJjMioWQ │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ WbCCGpq_XIr-2_jSXISZKQ ┆ 0    ┆ 2015-08-18 23:26:40 ┆ 1     ┆ ... ┆ 3.0   ┆ Easy street access with adequate... ┆ 0      ┆ 1rPlm6liFDqv8oSmuHSefA │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ld_H5-FpZOWm_tkzwkPYQQ ┆ 0    ┆ 2014-09-25 01:10:49 ┆ 0     ┆ ... ┆ 1.0   ┆ Think twice before staying here.... ┆ 1      ┆ Rz8za5LT_qXBgsL0ice5Qw │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ t0Qyogb4x--K9i5b0AoDCg ┆ 0    ┆ 2017-09-20 14:18:52 ┆ 0     ┆ ... ┆ 5.0   ┆ Reasonably priced, fast friendly... ┆ 0      ┆ uab7_Z8GPeiZ_Un-Jl3fVg │
└────────────────────────┴──────┴─────────────────────┴───────┴─────┴───────┴─────────────────────────────────────┴────────┴────────────────────────┘

But reading one more line causes the error:

pl.read_json(StringIO("".join(file_lines[0:3_785_594])), json_lines=True)
>>> pl.read_json(StringIO("".join(file_lines[0:3_785_594])), json_lines=True)
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: OutOfSpec("offsets must not exceed the values length")', /github/home/.cargo/git/checkouts/arrow2-8a2ad61d97265680/c720eb2/src/array/growable/utf8.rs:70:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/corey/.virtualenvs/StackOverflow3.10/lib/python3.10/site-packages/polars/io.py", line 917, in read_json
    return DataFrame._read_json(source, json_lines)
  File "/home/corey/.virtualenvs/StackOverflow3.10/lib/python3.10/site-packages/polars/internals/frame.py", line 818, in _read_json
    self._df = PyDataFrame.read_json(file, json_lines)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: OutOfSpec("offsets must not exceed the values length")
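
The binary search mentioned above can be sketched as follows. This is an illustration rather than the exact procedure used in the answer: largest_ok_prefix and the works predicate are hypothetical names, and the search assumes that every prefix up to some length parses and every longer prefix fails.

```python
def largest_ok_prefix(n_items, works):
    """Return the largest k in [0, n_items] for which works(k) is True.

    Assumes works(0) is True and that works is monotone: True for all
    prefix lengths up to some point, False for all longer ones.
    """
    lo, hi = 0, n_items
    while lo < hi:
        # Round up so the loop always makes progress when lo + 1 == hi.
        mid = (lo + hi + 1) // 2
        if works(mid):
            lo = mid       # a prefix of length mid parses; try longer
        else:
            hi = mid - 1   # a prefix of length mid fails; try shorter
    return lo


# Hypothetical predicate for the case above (needs polars, StringIO and
# file_lines from the earlier snippet). BaseException is caught on the
# assumption that the Rust panic may not derive from Exception.
#
# def works(k):
#     try:
#         pl.read_json(StringIO("".join(file_lines[:k])), json_lines=True)
#         return True
#     except BaseException:
#         return False
```

Each probe re-parses the whole prefix, so on a file of this size the search takes roughly 23 full reads; that is slow but needs no knowledge of the parser's internals.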

However, reading the records around that breakpoint reveals nothing particularly wrong or malformed.

pl.read_json(StringIO("".join(file_lines[3_785_592:3_785_595])), json_lines=True)
shape: (3, 9)
┌────────────────────────┬──────┬─────────────────────┬───────┬─────┬───────┬─────────────────────────────────────┬────────┬────────────────────────┐
│ business_id            ┆ cool ┆ date                ┆ funny ┆ ... ┆ stars ┆ text                                ┆ useful ┆ user_id                │
│ ---                    ┆ ---  ┆ ---                 ┆ ---   ┆     ┆ ---   ┆ ---                                 ┆ ---    ┆ ---                    │
│ str                    ┆ i64  ┆ str                 ┆ i64   ┆     ┆ f64   ┆ str                                 ┆ i64    ┆ str                    │
╞════════════════════════╪══════╪═════════════════════╪═══════╪═════╪═══════╪═════════════════════════════════════╪════════╪════════════════════════╡
│ t0Qyogb4x--K9i5b0AoDCg ┆ 0    ┆ 2017-09-20 14:18:52 ┆ 0     ┆ ... ┆ 5.0   ┆ Reasonably priced, fast friendly... ┆ 0      ┆ uab7_Z8GPeiZ_Un-Jl3fVg │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ wEdzUMaLE2ebYoe7Z0XGaA ┆ 0    ┆ 2017-07-18 00:16:16 ┆ 0     ┆ ... ┆ 1.0   ┆ I apologize to the readers of Ye... ┆ 0      ┆ tVkr6-lasqKzafoV5K4JfA │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ZF0tt7hn6WK3-aNWgtLcFA ┆ 0    ┆ 2016-08-01 22:08:07 ┆ 0     ┆ ... ┆ 5.0   ┆ Great place. Interesting to see ... ┆ 0      ┆ 9XT2LHohnC8v0T1H4Jxs2Q │
└────────────────────────┴──────┴─────────────────────┴───────┴─────┴───────┴─────────────────────────────────────┴────────┴────────────────────────┘

Apart from one long review, nothing in that stretch of the input file suggests a problem:

head -3785595 yelp_academic_dataset_review.json | tail -3
{"review_id":"kWSOtQvuANZIaCpnb2jNbA","user_id":"uab7_Z8GPeiZ_Un-Jl3fVg","business_id":"t0Qyogb4x--K9i5b0AoDCg","stars":5.0,"useful":0,"funny":0,"cool":0,"text":"Reasonably priced, fast friendly service, delicious Mexican food.  Our go-to place for Mexican takeout in Exton\/Lionville.  They also have tables for dining, you order at the counter.  Exceptional value for high quality fresh food.","date":"2017-09-20 14:18:52"}
{"review_id":"sOOPVuf02-Lz75cTI33KEw","user_id":"tVkr6-lasqKzafoV5K4JfA","business_id":"wEdzUMaLE2ebYoe7Z0XGaA","stars":1.0,"useful":0,"funny":0,"cool":0,"text":"I apologize to the readers of Yelp in advance for the length of this review. However, I felt the need to say what is on my mind. The one star I gave is for the kind and intelligent hostess who needs to be in the manager's position as he does not know how to do his job.  Firstly, this is NOT New York Pizza and the original in Brooklyn should be embarrassed that it bears their namesake. Ordered pizza for pickup. Arrived, got my pizza and went to my car. Opened the box to double check it and it was all sauce, with a very minute amount of \"mozzarella,\" which felt like rubber. I tasted the sauce that was on my finger from when I touched the cheese... horrifically BLAND. What happened next was worse than the bland sauce. I spoke to a very kind hostess, and asked for cheese to be added. She obliged and said they would remake it. The manager, who I see several other people have had issues with, came over and was extremely condescending. He explained that it's because they put sauce on top of the cheese... ok so why was there no cheese under the sauce then either hunny? Why he felt the need to explain to me why my pizza had no cheese is beyond me, especially when the situation had already been rectified. I then asked if that was also why there were burns on the top as well, and he found it amusing and stated \"it's only one burnt bubble...\" (It was waaaay more than one, but ok). Why is ANY PART OF MY FOOD BURNT SIR?! He then felt the need to explain how a coal brick oven works... I'm from Brooklyn, I've had PLENTY of pizza that is cooked this way, like for example, at Grimaldi's in BROOKLYN. When done properly, it doesn't come out BURNT on ANY PART of it. Anyway, I went from wanting cheese to wanting my money back, simply because of the manager's attitude. 
Which btw my refund was incorrect, but I wanted to leave so badly that I didn't even address that part. THEN he sarcastically offered me a free pizza, after I requested my money back, and when I declined he condescendingly gave me a $25 gift card and his business card. Sweetheart, I wanted cheese not a free meal, which your hostess had already taken care of before your snarky attitude disrupted our peaceful convo, get your life together. This immediately escalated from me allowing this business a chance to create a long time loyal patron and just getting CHEESE, to wanting to never set foot in this place again.  I assume by his smug demeanor that he is accustomed to treating his patrons this way. Anyway, I found a homeless person and gave him the gift card. I can only hope the homeless man wasn't offended by me giving him a gift card for this disgusting place.","date":"2017-07-18 00:16:16"}
{"review_id":"yzgx106UX9OlyBh0tq2G0g","user_id":"9XT2LHohnC8v0T1H4Jxs2Q","business_id":"ZF0tt7hn6WK3-aNWgtLcFA","stars":5.0,"useful":0,"funny":0,"cool":0,"text":"Great place. Interesting to see and learn the history about it. Can get some really cool pictures. Been here a few times and will keep coming back when we're in the area.","date":"2016-08-01 22:08:07"}

Even if I cut the file so that it stays well away from those records...

head -3500000 yelp_academic_dataset_review.json > head.json
tail -1000000 yelp_academic_dataset_review.json > tail.json
cat head.json tail.json > try.json

we still get the error when reading the resulting 4.5 million records...

>>> pl.read_json('/tmp/try.json', json_lines=True)
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: OutOfSpec("offsets must not exceed the values length")', /github/home/.cargo/git/checkouts/arrow2-8a2ad61d97265680/c720eb2/src/array/growable/utf8.rs:70:14
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/corey/.virtualenvs/StackOverflow3.10/lib/python3.10/site-packages/polars/io.py", line 917, in read_json
    return DataFrame._read_json(source, json_lines)
  File "/home/corey/.virtualenvs/StackOverflow3.10/lib/python3.10/site-packages/polars/internals/frame.py", line 818, in _read_json
    self._df = PyDataFrame.read_json(file, json_lines)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: OutOfSpec("offsets must not exceed the values length")
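
One way to probe whether the failure tracks the cumulative size of a string column, rather than any single record, is to add up the UTF-8 byte length of the text field line by line and see where the running total passes the 2 GB mark. The sketch below assumes that the limit corresponds to a signed 32-bit offset (2**31 - 1 bytes); that is a guess consistent with the error message, not something confirmed here. The function name and its parameters are illustrative.

```python
import json

# Assumption: the limit is the largest value a signed 32-bit offset can hold.
INT32_MAX = 2**31 - 1


def first_line_exceeding(lines, field="text", limit=INT32_MAX):
    """Return the 1-based line count at which the cumulative UTF-8 size
    of `field` first exceeds `limit`, or None if it never does."""
    total = 0
    for count, line in enumerate(lines, start=1):
        value = json.loads(line).get(field)
        if isinstance(value, str):
            total += len(value.encode("utf-8"))
        if total > limit:
            return count
    return None
```

Calling first_line_exceeding(file_lines) on the lines read earlier and comparing the result with the breakpoint found by the binary search would show whether the two coincide.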

Workaround

If you cut the input file into smaller slices, use read_json on each slice, and concatenate the results, you will get a DataFrame.

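I'll simulate this on my machine as follows. (You can split the file into slices larger than 1 million records; I chose that number simply because it is a convenient one.)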

import polars as pl
from io import StringIO

with open("/tmp/yelp_academic_dataset_review.json") as json_file:
    file_lines = json_file.readlines()

slice_size = 1_000_000
df = pl.concat(
    [
        pl.read_json(
            StringIO("".join(file_lines[offset: (offset + slice_size)])),
            json_lines=True,
        )
        for offset in range(0, len(file_lines), slice_size)
    ]
)
df
shape: (6990280, 9)
┌────────────────────────┬──────┬─────────────────────┬───────┬─────┬───────┬─────────────────────────────────────┬────────┬────────────────────────┐
│ business_id            ┆ cool ┆ date                ┆ funny ┆ ... ┆ stars ┆ text                                ┆ useful ┆ user_id                │
│ ---                    ┆ ---  ┆ ---                 ┆ ---   ┆     ┆ ---   ┆ ---                                 ┆ ---    ┆ ---                    │
│ str                    ┆ i64  ┆ str                 ┆ i64   ┆     ┆ f64   ┆ str                                 ┆ i64    ┆ str                    │
╞════════════════════════╪══════╪═════════════════════╪═══════╪═════╪═══════╪═════════════════════════════════════╪════════╪════════════════════════╡
│ XQfwVwDr-v0ZS3_CbbE5Xw ┆ 0    ┆ 2018-07-07 22:09:11 ┆ 0     ┆ ... ┆ 3.0   ┆ If you decide to eat here, just ... ┆ 0      ┆ mh_-eMZ6K5RLWhZyISBhwA │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 7ATYjTIgM3jUlt4UM3IypQ ┆ 1    ┆ 2012-01-03 15:28:18 ┆ 0     ┆ ... ┆ 5.0   ┆ I've taken a lot of spin classes... ┆ 1      ┆ OyoGAe7OKpv6SyGZT5g77Q │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ YjUWPpI6HXG530lwP-fb2A ┆ 0    ┆ 2014-02-05 20:30:30 ┆ 0     ┆ ... ┆ 3.0   ┆ Family diner. Had the buffet. Ec... ┆ 0      ┆ 8g_iMtfSiwikVnbP2etR0A │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ kxX2SOes4o-D3ZQBkiMRfA ┆ 1    ┆ 2015-01-04 00:01:03 ┆ 0     ┆ ... ┆ 5.0   ┆ Wow!  Yummy, different,  delicio... ┆ 1      ┆ _7bHUi9Uuf5__HHc_Q8guQ │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ...                    ┆ ...  ┆ ...                 ┆ ...   ┆ ... ┆ ...   ┆ ...                                 ┆ ...    ┆ ...                    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2vLksaMmSEcGbjI5gywpZA ┆ 2    ┆ 2021-03-31 16:55:10 ┆ 1     ┆ ... ┆ 5.0   ┆ This spot offers a great, afford... ┆ 2      ┆ Zo0th2m8Ez4gLSbHftiQvg │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ R1khUUxidqfaJmcpmGd4aw ┆ 0    ┆ 2019-12-30 03:56:30 ┆ 0     ┆ ... ┆ 4.0   ┆ This Home Depot won me over when... ┆ 1      ┆ mm6E4FbCMwJmb7kPDZ5v2Q │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ Rr9kKArrMhSLVE9a53q-aA ┆ 0    ┆ 2022-01-19 18:59:27 ┆ 0     ┆ ... ┆ 5.0   ┆ For when I'm feeling like ignori... ┆ 1      ┆ YwAMC-jvZ1fvEUum6QkEkw │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ VAeEXLbEcI9Emt9KGYq9aA ┆ 7    ┆ 2018-01-02 22:50:47 ┆ 3     ┆ ... ┆ 3.0   ┆ Located in the 'Walking District... ┆ 10     ┆ 6JehEvdoCvZPJ_XIxnzIIw │
└────────────────────────┴──────┴─────────────────────┴───────┴─────┴───────┴─────────────────────────────────────┴────────┴────────────────────────┘
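
The snippet above loads every line into a Python list first, which may be uncomfortable on a machine with less RAM, such as the 16 GB Mac in the question. A variation that reads the file slice by slice could look like the sketch below. The helper iter_line_chunks is a made-up name, and the Polars call in the comment simply mirrors the one used above.

```python
from itertools import islice


def iter_line_chunks(file_obj, chunk_size):
    """Yield consecutive groups of up to `chunk_size` lines from
    `file_obj`, each group joined into a single string."""
    while True:
        chunk = list(islice(file_obj, chunk_size))
        if not chunk:
            break
        yield "".join(chunk)


# Intended use with Polars (same read_json call as in the workaround):
#
# with open("/tmp/yelp_academic_dataset_review.json") as json_file:
#     df = pl.concat(
#         [
#             pl.read_json(StringIO(chunk), json_lines=True)
#             for chunk in iter_line_chunks(json_file, 1_000_000)
#         ]
#     )
```

Only one slice of raw text is held as Python strings at a time, although the parsed DataFrames for all slices still have to fit in memory before they are concatenated.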

Regarding python-polars - read_json in Polars causes an OutOfSpec error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73154392/