I have a directory containing these files:
ab_list
bd_list
cd_list
mno_list
hk_list
pd_list
I have another file named testfile, located outside this directory:
abc
que nw
ab_list ON 8
gs_list ON 9
hk_list OFF 9
bd_list ON 7
cd_list OFF 6
fr_list ON 5
mno_list ON 4
pq_list OFF 6
jk_list ON 7
pd_list OFF 8
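I want to compare the directory against testfile. For every file name in the directory that matches an entry in testfile marked ON, the file's contents should be merged into a new file named merge_file. For files that match an entry in testfile marked OFF, the file names should be written to new_file. The contents of ab_list, bd_list, and mno_list should be merged into top_file. The output in the new file should look like this: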
cd_list OFF no. of lines in file
pd_list OFF no. of lines in file
hk_list OFF no. of lines in file
merge_file (this has all ON merged) no. of lines in file
This is the code so far:
from pathlib import Path

with open('testfile') as fp:
    data = dict([tuple(line.split()) for line in fp if line.strip()])

with open('merge_file', 'w') as merge_file, open('match_file', 'w') as match_file:
    lines = 0
    for fp in Path(r'./test').glob('*_list'):
        if fp.name in data:
            if data[fp.name] == 'ON':
                content = fp.open().readlines()
                lines += len(content)
                merge_file.write('\n'.join(content) + '\n')
            else:
                content = fp.open().readlines()
                match_file.write(fp.name + ' OFF {}\n'.format(len(content)))
    match_file.write('merge_file (this has all ON merged) {}'.format(lines))
I want to read from the first line, but that raises an error, "IndexError: list index out of range". The current code reads from line 4.
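A plausible cause of the IndexError is a header line such as abc, which has only one field, so indexing the second element of its split() result fails. The sketch below shows one way to read testfile from the first line while ignoring such rows. The helper name parse_status is hypothetical, and it assumes that every data row has the shape "name ON|OFF count".

```python
def parse_status(lines):
    """Map each list name to its ON/OFF flag, ignoring other lines."""
    status = {}
    for line in lines:
        fields = line.split()
        # Assumption: data rows look like "<name> ON|OFF <count>".
        # Header rows such as "abc" or "que nw" fail this check and are skipped.
        if len(fields) >= 2 and fields[1] in ("ON", "OFF"):
            status[fields[0]] = fields[1]
    return status


sample = ["abc\n", "que nw\n", "ab_list ON 8\n", "hk_list OFF 9\n"]
print(parse_status(sample))
```

Checking the second field against ON and OFF, rather than slicing off a fixed number of lines, keeps the parsing independent of how many header lines the file has.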
Best answer
Assuming the directory is named Folder and contains another directory named folder, this code does the following:
from glob import glob

test_file_directory = "C:\\Users\\User\\Desktop\\Folder\\"
files1 = glob("*.txt")

with open(test_file_directory + "testfile.txt", "r") as f:
    # Keep every line with at least two fields, so header lines such as
    # "abc" are skipped without slicing off the first rows.
    files2 = [' '.join(l.split()[:2]) for l in f if len(l.split()) >= 2]

for f1 in files1:
    for f2 in files2:
        if f1[:-4] + ' ON' == f2:
            # Matching ON entry: append the file's contents to merge_file.txt.
            with open('merge_file.txt', 'a') as a:
                with open(f1, 'r') as r:
                    a.write(r.read() + '\n')
        elif f1[:-4] + ' OFF' == f2:
            # Matching OFF entry: record the name, flag, and line count.
            with open('match_file.txt', 'a') as a:
                with open(f1, 'r') as r:
                    a.write(f"{f2} {len(r.readlines())}\n")
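As an alternative to the nested loops, here is a minimal pathlib sketch that looks each file name up in a dictionary and opens each output file only once. It is an illustration under stated assumptions, not the answer's method: the function name merge_by_status and its parameters are hypothetical, the list files are assumed to match *_list with no extension (as in the question), and data rows in testfile are assumed to look like "name ON|OFF count".

```python
from pathlib import Path


def merge_by_status(list_dir, status_file, merge_path, report_path):
    """Merge ON files into merge_path; report OFF files with line counts."""
    # Read name -> flag, skipping header rows that are not "<name> ON|OFF ...".
    status = {}
    for line in Path(status_file).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] in ("ON", "OFF"):
            status[fields[0]] = fields[1]

    merged_lines = []
    report = []
    # sorted() makes the merge order deterministic (alphabetical by path).
    for path in sorted(Path(list_dir).glob("*_list")):
        flag = status.get(path.name)
        if flag is None:
            continue  # file is not mentioned in the status file
        lines = path.read_text().splitlines()
        if flag == "ON":
            merged_lines.extend(lines)
        else:
            report.append(f"{path.name} OFF {len(lines)}")

    # Final summary row, in the format requested in the question.
    report.append(f"merge_file (this has all ON merged) {len(merged_lines)}")
    Path(merge_path).write_text("".join(l + "\n" for l in merged_lines))
    Path(report_path).write_text("\n".join(report) + "\n")
    return report
```

A call such as merge_by_status("test", "testfile", "merge_file", "new_file") would mirror the paths used in the question. Because the outputs are written once rather than opened in append mode, running it a second time overwrites the previous results instead of adding to them.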
Regarding python - merging and comparing, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62389332/