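I have a large file that I need to load into a list of strings. Each element should contain the text up to a "," that immediately follows a number.
For example: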
this is some text, value 45789, followed by, 1245, and more text 78965, more random text 5252,
This should become:
["this is some text, value 45789", "followed by, 1245", "and more text 78965", "more random text 5252"]
I'm currently doing re.sub(r'([0-9]+),','~', <input-string>)
and then splitting on "~" (my file doesn't contain ~), but this throws away the number before the comma. Any ideas?
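Best answer
You can use re.split with a positive look-behind assertion. The look-behind (?<=\d) requires a digit immediately before the comma but does not consume it, so the number stays in the result: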
>>> import re
>>>
>>> text = 'this is some text, value 45789, followed by, 1245, and more text 78965, more random text 5252,'
>>> re.split(r'(?<=\d),', text)
['this is some text, value 45789',
' followed by, 1245',
' and more text 78965',
' more random text 5252',
'']
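Note that the result above still has leading spaces on each piece and an empty string at the end (from the trailing comma), so it does not yet match the desired list exactly. A minimal sketch of one way to clean it up follows; the helper name split_after_number is hypothetical and not part of the original answer. It also shows that the re.sub approach from the question can work if the replacement puts the captured digits back with \1.

```python
import re

def split_after_number(text):
    """Split text at every comma that immediately follows a digit.

    Hypothetical helper for illustration; not from the original answer.
    """
    # (?<=\d), matches a comma only when the previous character is a digit.
    # The look-behind does not consume the digit, so it stays in the piece.
    parts = re.split(r'(?<=\d),', text)
    # Strip surrounding whitespace and drop empty pieces
    # (such as the one produced by a trailing comma).
    return [p.strip() for p in parts if p.strip()]

text = ('this is some text, value 45789, followed by, 1245, '
        'and more text 78965, more random text 5252,')

result = split_after_number(text)
print(result)

# Alternative that keeps the question's re.sub idea: re-insert the captured
# digits with \1 before the "~" marker, then split on the marker.
# Assumes, as the question states, that the input never contains "~".
marked = re.sub(r'([0-9]+),', r'\1~', text)
alt_result = [p.strip() for p in marked.split('~') if p.strip()]
print(alt_result)
```

Both variants give the same four elements for the sample input. The look-behind version avoids the need for a marker character that must never occur in the data.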
This question, on splitting with a Python regex at a comma that follows a number, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/34668217/