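I am using Python 2.7. I have two TSV data files that I read into two dictionaries, and I want to compute their recall score, so I need to count tp (true positives) and fn (false negatives).
My dictionaries look like this: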
gold = {'A11':'cat', 'A22':'cat', 'B3':'mouse'}
results = {'A2':'cat', 'B2':'dog'}
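My code iterates over the gold dictionary and strips the trailing digits from the gold keys and from the results keys. It then checks whether the stripped keys match and, if they do, whether their values match, in order to count tp. However, my code seems to increment fn every time. Here is my runnable code: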
from __future__ import division
import string

def eval():
    tp = 0  # true positives
    fn = 0  # false negatives
    fp = 0  # false positives
    gold = {'A11':'cat', 'A22':'cat', 'B3':'mouse'}
    results = {'A2':'cat', 'B2':'dog'}
    # working copies (assumption: gold_copy and results_copy were not defined in the posted code)
    gold_copy = dict(gold)
    results_copy = dict(results)
    # iterate gold dictionary
    for i, j in gold.items():
        # remove the digits off gold keys
        i_stripped = i.rstrip(string.digits)
        # iterate the results that have not been matched yet
        for k, v in results_copy.items():
            # remove the digits off results keys
            k_stripped = k.rstrip(string.digits)
            # check if key match!
            if i_stripped == k_stripped:
                # check if values match then increment tp
                if j == v:
                    tp += 1
                    # delete dictionary entries to avoid counting them again
                    del gold_copy[i]
                    del results_copy[k]
                    # get out of this loop we found a match!
                    break
        # NO match was found in the results, then consider it as fn
        fn += 1  # <------ wrong calculations caused in this line
    print 'tp = %.2f fn = %.2f recall = %.2f ' % (tp, fn, float(tp)/(tp+fn))

eval()
This is the output:
tp = 1.00 fn = 3.00 recall = 0.25
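fn is incorrect: it should be 2 rather than 3. How can I stop fn from being incremented on every iteration? Any guidance would be truly appreciated.
Thanks,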
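For reference, the reported and the expected recall can be checked by hand. This is only a minimal sketch, assuming recall is tp / (tp + fn) as in the print statement above; the helper name recall is hypothetical and not part of the posted code.

```python
def recall(tp, fn):
    """Recall = tp / (tp + fn); float() keeps the division exact on Python 2 too."""
    return float(tp) / (tp + fn)


# Counts printed by the posted code: tp = 1, fn = 3
print('%.2f' % recall(1, 3))  # 0.25

# Counts the question expects: tp = 1, fn = 2
print('%.2f' % recall(1, 2))  # 0.33
```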
Best answer
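It looks like you only want to increment fn when no match was found in the results. In your code, fn += 1 runs once for every gold entry whether or not a match was found, so fn always ends up equal to the number of gold entries. You can use a variable to track whether a match has been found and increment fn based on that variable. Below I adjusted your code and used match_found for this purpose: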
# iterate gold dictionary
for i, j in gold.items():
    # create a variable that indicates whether a match was found
    match_found = False
    # remove the digits off gold keys
    i_stripped = i.rstrip(string.digits)
    # iterate the results that have not been matched yet
    # (assumption: results_copy is a copy of results, as the deletions below imply)
    for k, v in results_copy.items():
        # remove the digits off results keys
        k_stripped = k.rstrip(string.digits)
        # check if key match!
        if i_stripped == k_stripped:
            # check if values match then increment tp
            if j == v:
                tp += 1
                # now a match has been found, change variable
                match_found = True
                # delete dictionary entries to avoid counting them again
                del gold_copy[i]
                del results_copy[k]
                # get out of this loop we found a match!
                break
    # only if no match has been found for this gold entry, increment fn
    if not match_found:
        fn += 1
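The fragment above depends on variables defined in the question, so here is a self-contained sketch of the same flag-based idea that runs on both Python 2.7 and Python 3. The function name count_matches and the variable remaining are hypothetical and introduced only for illustration. It also assumes that each result may be matched at most once, which is what the deletions in the original code suggest.

```python
import string


def count_matches(gold, results):
    """Return (tp, fn) for gold entries compared against results.

    Keys are compared after stripping trailing digits. A gold entry counts
    as a true positive only if an unused result has the same stripped key
    and the same value; every other gold entry is a false negative.
    """
    tp = 0
    fn = 0
    # Copy the results so matched entries can be removed without touching the input.
    remaining = dict(results)

    for gold_key, gold_value in gold.items():
        match_found = False
        gold_prefix = gold_key.rstrip(string.digits)

        # list(...) takes a snapshot, so deleting from `remaining` is safe.
        for res_key, res_value in list(remaining.items()):
            same_key = res_key.rstrip(string.digits) == gold_prefix
            if same_key and res_value == gold_value:
                tp += 1
                match_found = True
                del remaining[res_key]  # a result is matched at most once
                break

        # fn grows only when the inner loop found nothing for this gold entry.
        if not match_found:
            fn += 1

    return tp, fn


gold = {'A11': 'cat', 'A22': 'cat', 'B3': 'mouse'}
results = {'A2': 'cat', 'B2': 'dog'}

tp, fn = count_matches(gold, results)
recall = float(tp) / (tp + fn)
print('tp = %d fn = %d recall = %.2f' % (tp, fn, recall))
# prints: tp = 1 fn = 2 recall = 0.33
```

The same logic could also be written with Python's for ... else clause, whose else branch runs only when the inner loop finishes without hitting break. The explicit flag is kept here because it mirrors the answer above.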
This post, on stopping a value from being incremented on every dictionary iteration in Python, is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/52425055/