I have a data frame with a single column, "text":
text
I love cakes     we should make them
Joe is very late  will there be photography?
you should wright code correctly      it is very important
I want to split each row wherever the text contains 2 or more consecutive spaces. So the desired output is:
text
I love cakes
we should make them
Joe is very late
will there be photography?
you should wright code correctly
it is very important
I know I can do this: df["text"].apply(lambda x: x.split("  "))
But I don't want to spell out every possible number of spaces in split (df["text"].apply(lambda x: x.split("  ")), df["text"].apply(lambda x: x.split("   ")), df["text"].apply(lambda x: x.split("    ")), ...).
I want a single "2 or more spaces" condition. How can I do that?
Best answer
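You can split on a regular expression and then explode the resulting column. The pattern \s{2,} matches any run of two or more whitespace characters, so single spaces between words are left alone: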
df = df['text'].str.split(r'\s{2,}').explode().reset_index(drop=True).to_frame()
Output
text
0 I love cakes
1 we should make them
2 Joe is very late
3 will there be photography?
4 you should wright code correctly
5 it is very important
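
To see what the pattern does on its own, independent of pandas, here is a minimal sketch using only the standard-library re module. The helper name split_on_gaps is hypothetical, and the exact number of spaces in the sample strings is an assumption; any run of two or more should behave the same way. Note that \s also matches tabs and newlines, so use ' {2,}' instead if only literal spaces should count.

```python
import re

# Two or more consecutive whitespace characters (spaces, tabs, newlines).
GAP = re.compile(r"\s{2,}")


def split_on_gaps(line):
    """Split a line wherever 2+ whitespace characters occur in a row."""
    # strip() first so leading/trailing padding does not yield empty pieces
    return GAP.split(line.strip())


# Sample rows; the widths of the gaps are assumed for illustration.
rows = [
    "I love cakes     we should make them",
    "Joe is very late  will there be photography?",
    "you should wright code correctly      it is very important",
]

# Flatten the per-line pieces into one list, much like explode() does
# for a pandas column of lists.
pieces = [piece for line in rows for piece in split_on_gaps(line)]

for piece in pieces:
    print(piece)
```

A line that contains only single spaces comes back as a one-element list, so rows without a wide gap pass through unchanged.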
Regarding "python - How to split a row of text into multiple rows when the text contains many spaces?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/74025257/