python - How can I split a text row into multiple rows wherever the text contains a run of several spaces?

I have a dataframe with a single column, "text":

text
I love cakes    we should make them
Joe is very late            will there be photography?
you should wright code correctly  it is very important

I want to break a row apart wherever there are 2 or more consecutive spaces in the text. So the desired output is:

text
I love cakes    
we should make them
Joe is very late            
will there be photography?
you should wright code correctly  
it is very important

I know I can do this: df["text"].apply(lambda x: x.split("  ")), but I don't want to spell out every possible number of spaces in split (df["text"].apply(lambda x: x.split("   ")), df["text"].apply(lambda x: x.split("    ")), df["text"].apply(lambda x: x.split("     ")), ...). I want a single "2 or more spaces" condition. How can I do that?

Best answer

You can split on a regular expression and then explode the resulting lists so that each piece gets its own row:

df = df['text'].str.split(r'\s{2,}').explode().reset_index(drop=True).to_frame()

Output

                               text
0                      I love cakes
1               we should make them
2                  Joe is very late
3        will there be photography?
4  you should wright code correctly
5              it is very important
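For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample frame from the question; the name `result` is only illustrative, and the pattern ` {2,}` is swapped in for `\s{2,}` on the assumption that only literal spaces (not tabs or newlines) should count as separators. The `.str.strip()` step is an extra precaution that is not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"text": [
    "I love cakes    we should make them",
    "Joe is very late            will there be photography?",
    "you should wright code correctly  it is very important",
]})

# Hypothetical variable name; any name works.
result = (
    df["text"]
    .str.strip()            # drop leading/trailing whitespace so no empty pieces appear
    .str.split(r" {2,}")    # multi-character patterns are treated as regular expressions
    .explode()              # one row per piece
    .reset_index(drop=True) # renumber rows 0..n-1
    .to_frame()             # back to a one-column DataFrame named "text"
)

print(result)
```

Single spaces inside a sentence are left alone because the pattern requires at least two in a row. If tabs or other whitespace should also act as separators, `\s{2,}` from the answer is the more permissive choice.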

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/74025257/
