python - Splitting and combining text based on a delimiter in Python

Tags: python string list text

I have a list of lists, each containing a string. After a lot of regular-expression work, I have inserted @@@ into my strings to use as a delimiter:

[['@@@this is part one and here is part two and here is part three and heres more and heres more'],
 ['this is part one@@@and here is part two and here is part three and heres more and heres more'],
 ['this is part one and here is part two@@@and here is part three and heres more and heres more'],
 ['this is part one and here is part two and here is part three@@@and heres more and heres more'],
 ['this is part one and here is part two and here is part three and heres more@@@and heres more']]

Now I need to come up with this:

[['this is part one'],['and here is part two'],['and here is part three'], ['and heres more'], ['and heres more']]  

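So far my attempts have been bloated, hacky, and ugly. I find myself splitting, combining, and matching over and over. Can anyone offer some general advice for this kind of problem, and suggest which tools make it manageable?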

Edit: please note that "and heres more" really does appear twice in the desired output!

Best answer

I think what you actually need is to capture all the characters after @@@ up to the next "and" or the end of the string.

>>> import re
>>> # l is the list of lists from the question
>>> [[m] for x in l for m in re.findall(r'@@@(.*?)(?=\sand\b|$)', x[0])]
[['this is part one'], ['and here is part two'], ['and here is part three'], ['and heres more'], ['and heres more']]
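For anyone who wants to run this outside the interactive prompt, here is a minimal self-contained sketch of the same idea. The names `rows`, `PATTERN`, and `extract_parts` are made up for illustration and do not come from the question; the data is the question's list with the missing commas restored.

```python
import re

# The list of lists from the question (missing commas restored).
rows = [
    ['@@@this is part one and here is part two and here is part three and heres more and heres more'],
    ['this is part one@@@and here is part two and here is part three and heres more and heres more'],
    ['this is part one and here is part two@@@and here is part three and heres more and heres more'],
    ['this is part one and here is part two and here is part three@@@and heres more and heres more'],
    ['this is part one and here is part two and here is part three and heres more@@@and heres more'],
]

# @@@             the literal marker
# (.*?)           lazily capture as few characters as possible
# (?=\sand\b|$)   stop right before whitespace + the whole word "and",
#                 or at the end of the string (lookahead, consumes nothing)
PATTERN = re.compile(r'@@@(.*?)(?=\sand\b|$)')


def extract_parts(rows):
    """Return one single-item list per '@@@' marker found in rows."""
    return [[match] for row in rows for match in PATTERN.findall(row[0])]


parts = extract_parts(rows)
print(parts)
```

Because the lookahead does not consume the following " and", a segment that itself starts with "and" right after the marker is still captured whole. A string without the marker simply contributes nothing to the result.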

Regarding "python - Splitting and combining text based on a delimiter in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/29000435/
