python - Comparing two text files line by line using Python

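I have two text files that I want to compare. The first file contains unique items, and the second file contains the same items repeated many times. I want to find out how many times each line is repeated in the second file. This is what I wrote: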

import os
import sys

f1 = open('file1.txt')  # this has the 27 unique lines
f1data = f1.readlines()

f2 = open('file2.txt')  # this has lines repeated various times, with a total of 11162 lines
f2data = f2.readlines()

sys.stdout = open("linecount.txt", "w")


for line1 in f1data:
    linecount = 0
    # count the lines of file2 that match the current line of file1
    for line2 in f2data:
        if line1 in line2:
            linecount += 1

    print line1, linecount

The problem is that when I add up the resulting line counts, I get 11586 instead of 11162. Why is the count inflated?

Is there another way to get the frequency of each line using Python?

Best answer

From https://docs.python.org/2.7/reference/expressions.html#in:

For the Unicode and string types, x in y is true if and only if x is a substring of y.

Instead of

    if line1 in line2:

I think you meant to write

    if line1 == line2:
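
To illustrate why the substring test inflates the total, here is a minimal sketch with made-up data. The item names and the `count_matches` helper are hypothetical and do not come from the original files. Whenever one unique line is contained in another (for example "THEFT" inside "AUTO THEFT"), `in` counts the longer line for both items, while `==` counts it only once.

```python
# Hypothetical sample data standing in for file1.txt and file2.txt.
unique_lines = ["THEFT\n", "AUTO THEFT\n"]
repeated_lines = ["THEFT\n", "AUTO THEFT\n", "AUTO THEFT\n"]


def count_matches(unique, repeated, match):
    """Map each unique line to the number of repeated lines for which match() is true."""
    return {u: sum(1 for r in repeated if match(u, r)) for u in unique}


# Substring test, as in the question: "THEFT\n" also matches "AUTO THEFT\n".
substring_counts = count_matches(unique_lines, repeated_lines, lambda a, b: a in b)

# Equality test, as suggested above: only whole-line matches are counted.
equality_counts = count_matches(unique_lines, repeated_lines, lambda a, b: a == b)

print(sum(substring_counts.values()))  # 5, more than the 3 lines in repeated_lines
print(sum(equality_counts.values()))   # 3, equal to len(repeated_lines)
```

With the equality test, the counts add up to the number of lines in the second file, provided every one of its lines also appears in the first file.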


Or perhaps replace the entire

for line2 in f2data:
    if line1 in line2:
        linecount+=1

block with

linecount = f2data.count(line1)  # list.count counts only exact, whole-line matches
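
As for the question of another way to get line frequencies, one possible approach is `collections.Counter` from the standard library, which tallies every line of the second file in a single pass. The sketch below is not part of the original answer. The function name `line_frequencies` is hypothetical, and it assumes that lines should be compared as whole lines with the trailing newline removed.

```python
from collections import Counter


def line_frequencies(unique_path, repeated_path):
    """Return (line, count) pairs: how often each line of unique_path
    occurs as a whole line in repeated_path."""
    # Tally every line of the repeated file once; stripping the newline
    # means a final line without a trailing "\n" still compares equal.
    with open(repeated_path) as f:
        counts = Counter(line.rstrip("\n") for line in f)

    # Look up each unique line; a Counter returns 0 for missing keys.
    with open(unique_path) as f:
        return [(line.rstrip("\n"), counts[line.rstrip("\n")]) for line in f]
```

Calling `line_frequencies('file1.txt', 'file2.txt')` would return one pair per line of the first file. Because the second file is read only once, this avoids rescanning all of its lines for each of the unique lines.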

This question and its answer, "python - Comparing two text files line by line using Python", come from Stack Overflow: https://stackoverflow.com/questions/33594948/
