Given an example string s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"', I want to split it into the following blocks:
# To Do: something like {l = s.split(',')}
l = ['Hi', 'my name is Humpty-Dumpty', '"Alice, Through the Looking Glass"']
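I don't know in advance where the delimiters will be or how many there are.
This is my initial idea. It is long and also inaccurate, because it removes all the delimiters, whereas I want the delimiters inside quotes to be preserved: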
s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
ss = []
inner_string = ""
delimiter = ','
for item in s.split(delimiter):
    if not inner_string:
        if '\"' not in item:  # regular string, not interesting
            ss.append(item)
        else:
            inner_string += item  # start inner string
    elif inner_string:
        inner_string += item
        if '\"' in item:  # end inner string
            ss.append(inner_string)
            inner_string = ""
        else:  # middle of inner string
            pass
print(ss)
# prints ['Hi', ' my name is Humpty-Dumpty', ' from "Alice Through the Looking Glass"'] which is OK-ish
Best answer
You can use re.split to split by a regular expression:
>>> import re
>>> [x for x in re.split(r'([^",]*(?:"[^"]*"[^",]*)*)', s) if x not in (',','')]
When s equals:
'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
it outputs:
['Hi', ' my name is Humpty-Dumpty', ' from "Alice, Through the Looking Glass"']
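The one-liner above can be wrapped in a small reusable function. The following is only a sketch: the names FIELD and split_outside_quotes are hypothetical and do not come from the original answer, and the extra strip() step is an assumption about the desired output (it removes the space left after each comma, which is closer to the list shown in the question).

```python
import re

# Same pattern as in the answer: a field is a run of characters other than
# " or , optionally interleaved with complete double-quoted blocks.
FIELD = re.compile(r'([^",]*(?:"[^"]*"[^",]*)*)')

def split_outside_quotes(text):
    """Split text on commas that are not inside double quotes (hypothetical helper)."""
    # re.split returns the captured fields plus the bare commas and empty
    # strings between them; keep only the fields.
    parts = [x for x in FIELD.split(text) if x not in (',', '')]
    # Assumption: surrounding whitespace is not wanted in the result.
    return [p.strip() for p in parts]

s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
print(split_outside_quotes(s))
# ['Hi', 'my name is Humpty-Dumpty', 'from "Alice, Through the Looking Glass"']
```

Note that, like the original one-liner, this drops empty fields (for example the one between two consecutive commas) and only handles the comma as a delimiter.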
Regex explanation:
(
    [^",]*          zero or more chars other than " or ,
    (?:             non-capturing group
        "[^"]*"     quoted block
        [^",]*      followed by zero or more chars other than " or ,
    )*              zero or more times
)
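The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace and allows # comments inside the regex. This is a sketch under that assumption; the name FIELD_VERBOSE is hypothetical, and the pattern is meant to be equivalent to the one-line version, not a different technique.

```python
import re

# The answer's pattern, written out with inline comments via re.VERBOSE.
FIELD_VERBOSE = re.compile(r'''
    (                   # capture one whole field
        [^",]*          # zero or more chars other than " or ,
        (?:             # non-capturing group
            "[^"]*"     # quoted block
            [^",]*      # followed by zero or more chars other than " or ,
        )*              # zero or more times
    )
''', re.VERBOSE)

s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
# Drop the bare commas and empty strings that re.split leaves between fields.
result = [x for x in FIELD_VERBOSE.split(s) if x not in (',', '')]
print(result)
# ['Hi', ' my name is Humpty-Dumpty', ' from "Alice, Through the Looking Glass"']
```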
Regarding python - keeping quoted blocks intact when splitting by a delimiter, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/53391766/