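I am trying to parse my CSV file with Python. Each line has four comma-separated elements. Every element is a string, but an element may itself contain commas; when it does, the element is wrapped in double quotes. The following example shows both cases, with and without quotes: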
http://data.europa.eu/esco/skill/CTC_43028,"use data extraction, transformation and loading tools","ETL|extract, transform, load","<div>Integrate information from multiple applications, created and maintained by various organisations, into one consistent and transparent data structure.</div>"
http://data.europa.eu/esco/skill/SCG.TS.1.4.m.2,support company plan,follow industry guidelines|follow organisation's vision|monitor policy implementation|support company mission,<div>Act within one's work role to advance the goals and vision of the organisation.</div>
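What I want is to split each line into its four elements. I tried Python's split function, but it did not work. I suppose I have to use a regular expression, but I am not familiar with them. Could you give me some help? Thanks a lot.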
Best answer
The csv module is what you want:
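    import csv

    # newline='' lets the csv module handle line endings itself
    with open('file.csv', newline='') as f:
        r = csv.reader(f)
        for row in r:
            # each row is a list of the four fields
            print(row)

This prints:

    ['http...', 'transformation ...', 'ETL|ext ...', '<div>Integrate ...']
    ['http:...', 'support ...', 'follow ...', '<div>Act ...']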
',' is the default delimiter, and '"' is the default quote character.
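To make those defaults concrete, here is a minimal sketch that passes them explicitly and compares the result with a plain split. The sample rows are shortened versions of the two lines in the question, and the names `sample`, `parse_rows`, and `naive` are made up for this illustration.

```python
import csv
import io

# Shortened versions of the two rows from the question (the <div> text is abbreviated).
# The first row quotes the fields that contain commas; the second needs no quotes.
sample = (
    'http://data.europa.eu/esco/skill/CTC_43028,'
    '"use data extraction, transformation and loading tools",'
    '"ETL|extract, transform, load",'
    '"<div>Integrate information.</div>"\n'
    "http://data.europa.eu/esco/skill/SCG.TS.1.4.m.2,support company plan,"
    "follow industry guidelines|support company mission,"
    "<div>Act within one's work role.</div>\n"
)


def parse_rows(text):
    """Parse CSV text into a list of rows (hypothetical helper for this example)."""
    # delimiter and quotechar are spelled out here, but these are the defaults anyway.
    reader = csv.reader(io.StringIO(text, newline=''), delimiter=',', quotechar='"')
    return list(reader)


rows = parse_rows(sample)
for row in rows:
    # Both rows come back as four fields, with the surrounding quotes removed.
    print(len(row), row)

# For comparison: a plain split breaks on the commas inside the quoted fields.
naive = sample.splitlines()[0].split(',')
print(len(naive))
```

A plain split cannot tell a separating comma from one inside a quoted field, which is why the first row falls apart with `split(',')` but parses cleanly with `csv.reader`.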
Regarding "python - split string conditionally", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42189614/