python - Splitting a string into substrings

I want to split a string into separate strings, one for each field:

name,city,points,score,cards

I have these strings:

Paul Grid - Hong Kong 56  663 0
Anna Grid - Tokyo 16  363 0
Greg H.Johs - Hong Kong -6  363 4
Jessy Holm Smith - Jakarta 8  261 0

The format is:

Name[SPACE]-[SPACE]City[SPACE]-Points[SPACE][SPACE]Score[SPACE]Cards

Name can contain spaces and '.'
City can contain spaces
There is sometimes a double space, e.g. between Points and Score
Score, Points, and Cards can be negative

The rules I want to implement in Python are as follows:

Name: From the beginning until you see "-", then strip the trailing space from that string.
Cards: From the end, going backwards until you meet the first space.
Score: From the space you hit when you made Cards, go back until the next space.
Points: From the space you hit when you made Score, go back until the next space.
City: Everything between where Name ended and the space where Points stopped.

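My problem is that I can't simply replace the spaces with a delimiter, because spaces can occur in the name and the city, and "-" is used to separate name and city.

I could do this the brute-force way, character by character, but I'm wondering whether Python has a smart way to do it?

My end goal is to split each line into fields so that I can access e.g. scorerecord.name, scorerecord.city, etc.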

Best answer

Use the re.match() function (here via a compiled pattern) with a regular expression that has one named group per field:

import re

data = '''Paul Grid - Hong Kong 56  663 0
Anna Grid - Tokyo 16  363 0
Greg H.Johs - Hong Kong -6  363 4
Jessy Holm Smith - Jakarta 8  261 0'''

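# One record per line.
data = data.split('\n')
# Each field is a named group. The minus sign is optional on every numeric
# field, since the question says Points, Score and Cards can be negative.
pat = re.compile(r'(?P<name>[^-]+) +- *(?P<city>[^0-9]+) +(?P<points>-?[0-9]+) +'
                 r'(?P<score>-?[0-9]+) +(?P<cards>-?[0-9]+)')

# groupdict() returns the named groups as a dict of field name -> string.
result = [pat.match(s).groupdict() for s in data]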

print(result)

Output:

[{'name': 'Paul Grid', 'city': 'Hong Kong', 'points': '56', 'score': '663', 'cards': '0'}, {'name': 'Anna Grid', 'city': 'Tokyo', 'points': '16', 'score': '363', 'cards': '0'}, {'name': 'Greg H.Johs', 'city': 'Hong Kong', 'points': '-6', 'score': '363', 'cards': '4'}, {'name': 'Jessy Holm Smith', 'city': 'Jakarta', 'points': '8', 'score': '261', 'cards': '0'}]
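
The question asks for attribute access such as scorerecord.name. One way to get there from the groupdict() result might be to unpack each match into a small record type and convert the numeric fields to int. This is only a minimal sketch: the ScoreRecord class and the parse_line helper are hypothetical names that do not appear in the original answer, and the pattern is a variant of the one above that allows an optional minus sign on every numeric field.

```python
import re
from typing import NamedTuple, Optional


class ScoreRecord(NamedTuple):
    """Hypothetical record type; field names follow the question."""
    name: str
    city: str
    points: int
    score: int
    cards: int


# Assumption: names contain no "-" and cities contain no digits.
PATTERN = re.compile(
    r'(?P<name>[^-]+) +- *(?P<city>[^0-9]+) +(?P<points>-?[0-9]+) +'
    r'(?P<score>-?[0-9]+) +(?P<cards>-?[0-9]+)'
)


def parse_line(line: str) -> Optional[ScoreRecord]:
    """Return a ScoreRecord, or None if the line does not fit the format."""
    m = PATTERN.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    return ScoreRecord(
        name=fields['name'],
        city=fields['city'],
        points=int(fields['points']),
        score=int(fields['score']),
        cards=int(fields['cards']),
    )


scorerecord = parse_line('Greg H.Johs - Hong Kong -6  363 4')
print(scorerecord.name, scorerecord.city, scorerecord.points)
```

Returning None for a line that does not match avoids the AttributeError that calling .groupdict() on a failed match would raise; whether to skip or reject such lines is left to the caller.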

https://docs.python.org/3/library/re.html#re.match.groupdict
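
The rules listed in the question can also be followed fairly literally without a regular expression: str.partition cuts off the name at the first " - ", and str.rsplit with a maxsplit of 3 peels the last three whitespace-separated fields off the right-hand side. A minimal sketch, assuming a hypothetical split_record helper (not part of the original answer) and assuming that a name never contains " - " (space, hyphen, space):

```python
def split_record(line):
    """Split one line into its five fields using only str methods."""
    # Name: everything before the first " - " separator.
    name, sep, rest = line.partition(' - ')
    if not sep:
        raise ValueError(f'missing " - " separator: {line!r}')

    # Cards, Score, Points: the last three whitespace-separated tokens.
    # Passing None as the separator treats a run of spaces (such as the
    # double space before Score) as a single separator.
    parts = rest.rsplit(None, 3)
    if len(parts) != 4:
        raise ValueError(f'expected city plus three numbers: {line!r}')
    city, points, score, cards = parts

    return {
        'name': name.strip(),
        'city': city.strip(),
        'points': int(points),  # int() accepts a leading minus sign
        'score': int(score),
        'cards': int(cards),
    }


print(split_record('Paul Grid - Hong Kong 56  663 0'))
```

Compared with the regular expression, this variant tolerates digits in the city and hyphens inside a name (as long as the hyphen is not surrounded by spaces), at the cost of raising ValueError from int() when one of the last three tokens is not a number.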

Regarding python - Splitting a string into substrings, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50187369/
