I have a very large file, like this:
[PATTERN1]
line1
line2
line3
...
...
[END PATTERN]
[PATTERN2]
line1
line2
...
...
[END PATTERN]
I need to extract, into another file, the lines between a variable starter pattern such as [PATTERN1] and a fixed end pattern [END PATTERN], but only for some specific starter patterns.
For example:
[PATTERN2]
line1
line2
...
...
[END PATTERN]
I already do the same thing, with a smaller file, using this code:
FILE = open('myfile').readlines()
newfile = []
for n in name_list:
    A = FILE[[s for s, name in enumerate(FILE) if n in name][0]:]
    B = A[:[e+1 for e, end in enumerate(A) if 'END PATTERN' in end][0]]
    newfile.append(B)
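Here "name_list" is a list containing the specific starter patterns I need.
It works! But I suspect there is a better way to handle big files, one that does not use .readlines().
Can anyone help me?
Thanks a lot!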
Best answer
Consider:
# hi
# there
# begin
# need
# this
# stuff
# end
# skip
# this
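with open(__file__) as fp:
    # Skip everything up to and including the '# begin' line.
    for line in iter(fp.readline, '# begin\n'):
        pass
    # Print each following line until the '# end' line is reached.
    for line in iter(fp.readline, '# end\n'):
        print(line, end='')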
This prints the three lines between the markers: "# need", "# this" and "# stuff".
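One caveat with the two-argument iter() form: if the sentinel line never appears, fp.readline keeps returning '' at end of file, so the loop never terminates. Below is a minimal sketch of a guarded variant. The helper names read_until and lines_between are hypothetical, and io.StringIO stands in for the real file.

```python
import io


def read_until(fp, sentinel):
    """Yield lines from fp up to (not including) sentinel, stopping at EOF too."""
    # readline() returns '' only at end of file, so '' is a safe stop value.
    for line in iter(fp.readline, ''):
        if line == sentinel:
            return
        yield line


def lines_between(fp, begin, end):
    """Return the lines strictly between the begin and end marker lines."""
    # Consume and discard everything up to and including the begin marker.
    for _ in read_until(fp, begin):
        pass
    # Collect lines until the end marker (or EOF if it is missing).
    return list(read_until(fp, end))


# Hypothetical sample data standing in for a file on disk.
sample = "hi\nthere\n# begin\nneed\nthis\nstuff\n# end\nskip\nthis\n"
between = lines_between(io.StringIO(sample), '# begin\n', '# end\n')
```

If the begin marker is missing, this variant returns an empty list instead of looping forever.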
A more flexible approach (for example, one that allows matching with re patterns) is to use dropwhile and takewhile from itertools:
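import itertools

with open(__file__) as fp:
    # result starts at the first line containing 'begin' and stops before the first later line containing 'end'.
    result = list(itertools.takewhile(lambda x: 'end' not in x,
        itertools.dropwhile(lambda x: 'begin' not in x, fp)))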
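For the case in the question, where several starter patterns are wanted, the same streaming idea can be written as a single pass over the file. This is only a sketch under some assumptions: iter_block_lines is a hypothetical name, matching is by plain substring as in the question's code, and every matching block is emitted in file order (the original code took only the first block per name). A block with no end marker is emitted up to the end of the input.

```python
import io


def iter_block_lines(lines, starters, end_marker='END PATTERN'):
    """Yield the lines of every block that opens with one of the starters."""
    inside = False
    for line in lines:
        # A block opens on any line containing one of the wanted starters.
        if not inside and any(s in line for s in starters):
            inside = True
        if inside:
            yield line
            # The block closes on the first line containing the end marker.
            if end_marker in line:
                inside = False


# Hypothetical sample data standing in for the large file.
sample = (
    "[PATTERN1]\na\nb\n[END PATTERN]\n"
    "[PATTERN2]\nc\n[END PATTERN]\n"
    "[PATTERN3]\nd\n[END PATTERN]\n"
)
wanted = list(iter_block_lines(io.StringIO(sample), ['[PATTERN2]']))

# With real files (the output filename is an assumption):
# with open('myfile') as src, open('extracted.txt', 'w') as dst:
#     dst.writelines(iter_block_lines(src, name_list))
```

Because the function is a generator and a file object yields one line at a time, only the current line is held in memory, which is the point of avoiding .readlines().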
Regarding "python - How to grep lines between two patterns in a big file with python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11156259/