python - Keeping quoted blocks intact when splitting by a delimiter

Tags: python python-3.x split

Given an example string s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"', I want to split it into the following blocks:

# To Do: something like {l = s.split(',')}
l = ['Hi', 'my name is Humpty-Dumpty', '"Alice, Through the Looking Glass"']

I don't know in advance where the delimiters will appear or how many there will be.

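This is my initial attempt. It is long, and it is not accurate: it removes every delimiter, whereas I want the delimiters inside quotes to be preserved: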

s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
ss = []
inner_string = ""
delimiter = ','

for item in s.split(delimiter):
    if not inner_string: 
        if '\"' not in item:  # regular string, not interesting
            ss.append(item)
        else:
            inner_string += item # start inner string

    elif inner_string:
        inner_string += item

        if '\"' in item:  # end inner string
            ss.append(inner_string)
            inner_string = ""
        else:            # middle of inner string
            pass

print(ss)
# prints ['Hi', ' my name is Humpty-Dumpty', ' from "Alice Through the Looking Glass"'] which is OK-ish

Best answer

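You can use re.split with a regular expression. Because the pattern is wrapped in a capturing group, re.split also returns the matched fields, and the list comprehension then discards the leftover commas and empty strings: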

>>> import re
>>> [x for x in re.split(r'([^",]*(?:"[^"]*"[^",]*)*)', s) if x not in (',','')]

where s is:

'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'

This outputs:

['Hi', ' my name is Humpty-Dumpty', ' from "Alice, Through the Looking Glass"']
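For comparison, the same result can be reached without a regular expression by scanning the string one character at a time and tracking whether the scan is currently inside quotes. This is only a minimal sketch, not part of the original answer: the function name split_outside_quotes and its parameters are hypothetical, and it assumes quotes are balanced and never escaped.

```python
def split_outside_quotes(text, delimiter=',', quote='"'):
    """Split text on delimiter, ignoring delimiters inside quoted blocks.

    Hypothetical helper: assumes balanced, unescaped quote characters.
    """
    parts = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # Toggle the quoted state and keep the quote character itself.
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == delimiter and not in_quotes:
            # A delimiter outside quotes ends the current field.
            parts.append(''.join(current))
            current = []
        else:
            current.append(ch)
    # The last field has no trailing delimiter, so flush it here.
    parts.append(''.join(current))
    return parts


s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
print(split_outside_quotes(s))
# ['Hi', ' my name is Humpty-Dumpty', ' from "Alice, Through the Looking Glass"']
print([part.strip() for part in split_outside_quotes(s)])
# ['Hi', 'my name is Humpty-Dumpty', 'from "Alice, Through the Looking Glass"']
```

Unlike the filtered re.split call, this sketch keeps empty fields (for example 'a,,b' yields three items), which may or may not be what you want.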

Explanation of the regular expression:

(
    [^",]*          zero or more chars other than " or ,
    (?:             non-capturing group
        "[^"]*"     quoted block
        [^",]*      followed by zero or more chars other than " or ,
    )*              zero or more times
)
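The annotated pattern above can also be kept in the code itself by compiling it with re.VERBOSE, which ignores whitespace and allows # comments inside the pattern. The sketch below is an illustrative variation rather than the answer's exact code: it drops the outer capturing group and uses findall instead of re.split, and the names FIELD and split_keep_quotes are hypothetical.

```python
import re

# Same pattern as in the explanation, without the outer capturing group.
FIELD = re.compile(r'''
    [^",]*          # zero or more chars other than " or ,
    (?:             # non-capturing group
        "[^"]*"     # quoted block
        [^",]*      # followed by zero or more chars other than " or ,
    )*              # zero or more times
''', re.VERBOSE)


def split_keep_quotes(text):
    """Return the comma-separated fields of text, keeping quoted blocks whole."""
    # The pattern can match the empty string (for example at each comma),
    # so empty matches are filtered out.
    return [field for field in FIELD.findall(text) if field]


s = 'Hi, my name is Humpty-Dumpty, from "Alice, Through the Looking Glass"'
print(split_keep_quotes(s))
# ['Hi', ' my name is Humpty-Dumpty', ' from "Alice, Through the Looking Glass"']
```

As with the re.split version, filtering out empty strings means genuinely empty fields (two commas in a row) are dropped as well.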

Regarding python - keeping quoted blocks intact when splitting by a delimiter, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53391766/
