I have a simple CSV file like this:
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Note ,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32
1,,,,,X,,,,,,,,X,,,,,,,,X,,,,,,,,X,,,
2,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
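I need to parse it into a 2D array containing 1s where the Xs are and 0s otherwise, ignoring the header and extra rows.
After reading the documentation for the csv module, I wrote a simple script like this: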
import csv

csvfile = open('input.csv', 'rb')
reader = csv.reader(csvfile, dialect='excel', delimiter=' ', quotechar='|')
data = []
rowCount = 0
for row in reader:
    if rowCount > 2:  # skip first 3 rows (2 empty and 1 label)
        dataRow = []
        for i in xrange(1, len(row[0])):  # skip 1st label column
            dataRow.append(1 if row[0][i] == 'X' else 0)  # append 1s for X, 0s otherwise
        data.append(dataRow)
    rowCount += 1

print data
This gives me the expected output:
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
for the input row:
,,,,,X,,,,,,,,X,,,,,,,,X,,,,,,,,X,,,
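The ternary condition could be written as ord(row[0][i])/88, but is it possible to map each row of strings to a row of integer 1s and 0s?
Is there a more "pythonic" way to write this?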
Best answer
You should use delimiter=',':
reader = csv.reader(csvfile, dialect='excel', delimiter=',', quotechar='|')
In fact, dialect='excel' and delimiter=',' are the defaults, and quotechar='|' is not needed for your example file (keep it if you need it). So this shorter version works:
reader = csv.reader(csvfile)
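As a quick check of the claim about defaults, the sketch below compares a reader built with explicit arguments against one built with none. It assumes Python 3 and uses an in-memory sample line made up for illustration. It also prints the 'excel' dialect's own delimiter and quote character.

```python
import csv

# Hypothetical sample line, shaped like the data rows in the question.
lines = ["1,,X,,X"]

explicit = list(csv.reader(lines, dialect='excel', delimiter=','))
default = list(csv.reader(lines))

print(explicit == default)   # do both readers split the line the same way?
print(csv.excel.delimiter)   # delimiter of the 'excel' dialect
print(csv.excel.quotechar)   # quote character of the 'excel' dialect
```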
Skip the first three rows:
[next(reader) for _ in range(3)]
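The list comprehension above is used only for its side effect and builds a throwaway list of the skipped rows. One possible alternative is itertools.islice, sketched here under Python 3 with a hypothetical in-memory sample standing in for the file:

```python
import csv
from itertools import islice

# Hypothetical sample: two empty rows, one header row, two data rows.
lines = [",,,", ",,,", "Note ,1,2,3", "1,,X,", "2,,,X"]

reader = csv.reader(lines)
# islice(reader, 3, None) lazily skips the first three rows
# and yields everything after them.
remaining = list(islice(reader, 3, None))
print(remaining)
```

A plain loop calling next(reader) three times would work equally well; which form reads better is a matter of taste.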
Read all the rows:
data = [[1 if entry=='X' else 0 for entry in row[1:]] for row in reader]
This is equivalent to:
data = []
for row in reader:
    data.append([1 if entry == 'X' else 0 for entry in row[1:]])
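To make the equivalence concrete, here is a small self-contained sketch that runs both forms on the same input. It assumes Python 3, and the three sample rows are invented for illustration (data rows only, with the label in the first column).

```python
import csv

# Hypothetical sample: data rows only, first column is the row label.
lines = ["1,,X,", "2,,,X", "3,,,"]

# Nested list comprehension.
data_comp = [[1 if entry == 'X' else 0 for entry in row[1:]]
             for row in csv.reader(lines)]

# Equivalent explicit loop.
data_loop = []
for row in csv.reader(lines):
    data_loop.append([1 if entry == 'X' else 0 for entry in row[1:]])

print(data_comp)
print(data_comp == data_loop)
```

Because a bool converts to 0 or 1, int(entry == 'X') could replace the ternary expression if that reads better.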
Of course, open your file with a with statement so that it is closed automatically once the indented block ends:
with open('input.csv', 'rb') as csvfile:
    # Put the rest of the algorithm here.
    pass
# The file is closed automatically as soon as the block is dedented.
This is a prime example of a context manager.
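The closing behaviour can be observed directly through the file object's closed attribute. The following sketch assumes Python 3 and writes to a throwaway temporary file so that it does not depend on input.csv existing:

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.csv')
os.close(fd)

with open(path, 'w') as csvfile:
    csvfile.write('1,,X,\n')
    inside = csvfile.closed   # checked while still inside the block

outside = csvfile.closed      # checked after the block has ended
os.remove(path)
print(inside, outside)
```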
Putting it all together:
import csv

with open('input.csv', 'rb') as csvfile:
    reader = csv.reader(csvfile)
    [next(reader) for _ in range(3)]
    data = [[1 if entry == 'X' else 0 for entry in row[1:]] for row in reader]
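The code above targets Python 2. Under Python 3 the csv module expects a file opened in text mode, and its documentation recommends passing newline='' to open. A minimal sketch of a port follows; the function name parse_grid, its header_rows parameter, and the small sample file are all made up for illustration.

```python
import csv
import os
import tempfile


def parse_grid(path, header_rows=3):
    """Return a 2D list with 1 where a cell is 'X' and 0 elsewhere."""
    with open(path, newline='') as csvfile:
        reader = csv.reader(csvfile)
        for _ in range(header_rows):
            next(reader)  # skip the empty and header rows
        return [[1 if entry == 'X' else 0 for entry in row[1:]]
                for row in reader]


# Usage with a small hypothetical file written to a temporary location.
sample = ",,,\n,,,\nNote ,1,2,3\n1,,X,\n2,,,X\n"
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w', newline='') as f:
    f.write(sample)
grid = parse_grid(path)
os.remove(path)
print(grid)
```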
This is based on a similar question found on Stack Overflow, "python - How to parse a simple CSV file?": https://stackoverflow.com/questions/30274663/