I want to split a string into separate strings, one for each field:
name,city,points,score,cards
I have these strings:
Paul Grid - Hong Kong 56 663 0
Anna Grid - Tokyo 16 363 0
Greg H.Johs - Hong Kong -6 363 4
Jessy Holm Smith - Jakarta 8 261 0
The format is:
Name[SPACE]-[SPACE]City[SPACE]-Points[SPACE][SPACE]Score[SPACE]Cards
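- The name can contain spaces and '.'
- The city can contain spaces
- There is sometimes a double space between fields, e.g. between Points and Score
- Score, Points and Cards can be negative
The rules I want to implement in Python are: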
Name: From the beginning until you see "-", then strip the trailing space from that string.
Cards: From the end, going backwards until you meet the first space.
Score: From the space you hit when reading Cards, go back until the next space.
Points: From the space you hit when reading Score, go back until the next space.
City: Everything between where Name ended and the space where Points stopped.
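My problem is that I can't simply replace spaces with a delimiter, because spaces can appear inside the name and the city, and "-" is what separates the name from the city.
I could do this the brute-force way, going through the string character by character, but I'd like to know whether Python has a smarter way to do it.
My end goal is to split each line into fields so that I can address, e.g., scorerecord.name, scorerecord.city, etc.
Best answer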
Use the re.match() function with a specific regular expression pattern:
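import re

data = '''Paul Grid - Hong Kong 56 663 0
Anna Grid - Tokyo 16 363 0
Greg H.Johs - Hong Kong -6 363 4
Jessy Holm Smith - Jakarta 8 261 0'''
data = data.split('\n')

# Named groups: name (no hyphens), city (no digits), then three integers.
# " +" between fields tolerates the occasional double space.
# "-?" allows a leading minus sign, since the question says all three numbers can be negative.
pat = re.compile(r'(?P<name>[^-]+) +- *(?P<city>[^0-9]+) +(?P<points>-?[0-9]+) +'
                 r'(?P<score>-?[0-9]+) +(?P<cards>-?[0-9]+)')

# Note: pat.match() returns None for a line that does not fit the pattern.
result = [pat.match(s).groupdict() for s in data]
print(result)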
Output:
[{'name': 'Paul Grid', 'city': 'Hong Kong', 'points': '56', 'score': '663', 'cards': '0'}, {'name': 'Anna Grid', 'city': 'Tokyo', 'points': '16', 'score': '363', 'cards': '0'}, {'name': 'Greg H.Johs', 'city': 'Hong Kong', 'points': '-6', 'score': '363', 'cards': '4'}, {'name': 'Jessy Holm Smith', 'city': 'Jakarta', 'points': '8', 'score': '261', 'cards': '0'}]
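The question asks for attribute access such as scorerecord.name, while groupdict() returns plain dictionaries of strings. One possible way to bridge that gap is sketched below. The ScoreRecord type and the parse_record() helper are hypothetical names introduced for illustration, not part of the original answer. The pattern is based on the one in the answer, with an optional minus sign allowed on every numeric field (an assumption taken from the question's note that these values may be negative).

```python
import re
from typing import NamedTuple, Optional


class ScoreRecord(NamedTuple):
    """Hypothetical record type; the field names follow the question."""
    name: str
    city: str
    points: int
    score: int
    cards: int


# Assumption: name contains no "-", city contains no digits.
PATTERN = re.compile(
    r'(?P<name>[^-]+) +- *(?P<city>[^0-9]+) +(?P<points>-?[0-9]+) +'
    r'(?P<score>-?[0-9]+) +(?P<cards>-?[0-9]+)'
)


def parse_record(line: str) -> Optional[ScoreRecord]:
    """Return a ScoreRecord, or None if the line does not fit the pattern."""
    m = PATTERN.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields from str to int; name and city stay as text.
    return ScoreRecord(
        name=fields['name'],
        city=fields['city'],
        points=int(fields['points']),
        score=int(fields['score']),
        cards=int(fields['cards']),
    )


record = parse_record('Greg H.Johs - Hong Kong -6 363 4')
print(record.name, record.city, record.points)  # Greg H.Johs Hong Kong -6
```

Returning None for a non-matching line, rather than raising, lets the caller decide whether to skip or report malformed input.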
https://docs.python.org/3/library/re.html#re.match.groupdict
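As an alternative, the rules listed in the question (name from the front, then cards, score and points peeled off from the end) can be followed fairly literally with plain string methods instead of a regex. This is only a sketch: ScoreRecord and parse_by_rules() are hypothetical names, and it assumes the name/city separator is always a hyphen with a space on each side, which is slightly stricter than the question's "until you see -" rule but tolerates hyphens inside a name.

```python
from typing import NamedTuple


class ScoreRecord(NamedTuple):
    """Hypothetical record type; the field names follow the question."""
    name: str
    city: str
    points: int
    score: int
    cards: int


def parse_by_rules(line: str) -> ScoreRecord:
    # Name: everything before the first " - " (assumed separator).
    name, sep, rest = line.partition(' - ')
    if not sep:
        raise ValueError(f'no " - " separator in line: {line!r}')

    # Cards, Score, Points: split off the last three whitespace-separated
    # tokens. rsplit(None, 3) treats runs of spaces as one separator, so an
    # occasional double space is harmless. Whatever is left is the city.
    parts = rest.rsplit(None, 3)
    if len(parts) != 4:
        raise ValueError(f'expected city and three numbers in line: {line!r}')
    city, points, score, cards = parts

    # int() accepts a leading minus sign, so negative values work as-is.
    return ScoreRecord(name.strip(), city.strip(),
                       int(points), int(score), int(cards))


record = parse_by_rules('Jessy Holm Smith - Jakarta 8  261 0')
print(record.name, record.city, record.score)  # Jessy Holm Smith Jakarta 261
```

Unlike the regex, this version places no restriction on digits in the city name, but it does raise ValueError if any of the last three tokens is not an integer.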
Regarding "python - splitting a string into substrings", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50187369/