python - Reading a file outside a function in order to iterate over it

I created a function that I want to run over an entire file, but I am having some trouble: I only get output for the last line of the file.

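I have two different input files. The idea is to take the lines from one file, collect certain terms and add them to a dictionary, then search the second file for the corresponding lines and print the output. I know the problem is most likely where I placed the call to the function.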

The matrix file looks like this:

        Sp_ds   Sp_hs   Sp_log  Sp_plat 
c3833_g1_i2     4.00    0.07    16.84   26.37 
c4832_g1_i1     24.55   116.87  220.53  28.82 
c5161_g1_i1     107.49  89.39   26.95   698.97 
c4399_g1_i2     27.91   72.57   5.56    36.58 
c5916_g1_i1     82.57   19.03   48.55   258.22

The BLAST file looks like this:

c0_g1_i1|m.1    gi|74665200|sp|Q9HGP0.1|PVG4_SCHPO      100.00  372     0       0       1       372     1       372     0.0       754 
c1000_g1_i1|m.799       gi|48474761|sp|O94288.1|NOC3_SCHPO      100.00  747     0       0       5       751     1       747     0.0      1506 
c1001_g1_i1|m.800       gi|259016383|sp|O42919.3|RT26A_SCHPO    100.00  268     0       0       1       268     1       268     0.0       557 
c1002_g1_i1|m.801       gi|1723464|sp|Q10302.1|YD49_SCHPO       100.00  646     0       0       1       646     1       646     0.0      1310 
c1003_g1_i1|m.803       gi|74631197|sp|Q6BDR8.1|NSE4_SCHPO      100.00  246     0       0       1       246     1       246     1e-179    502 
c1004_g1_i1|m.804       gi|74676184|sp|O94325.1|PEX5_SCHPO      100.00  598     0       0       1       598     1       598     0.0      1227 
c1005_g1_i1|m.805       gi|9910811|sp|O42832.2|SPB1_SCHPO       100.00  802     0       0       1       802     1       802     0.0      1644 
c1006_g1_i1|m.806       gi|74627042|sp|O94631.1|MRM1_SCHPO      100.00  255     0       0       1       255     47      301     0.0       525 
c1007_g1_i1|m.807       gi|20137702|sp|O74370.1|ISY1_SCHPO      100.00  201     0       0       1       201     1       201     4e-146    412

This is the program I have so far:

def parse_blast(blast_line="NA"):
    transcript = blast_line[0][0]
    swissProt = blast_line[1][3]
    return(transcript, swissProt)

blast = open("/scratch/RNASeq/blastp.outfmt6")
for line in blast:
      line= [item.split('|') for item in line.split()]
      (transcript, swissProt) = parse_blast(blast_line = line)


transcript_to_protein = {}
transcript_to_protein[transcript] = swissProt
if transcript in transcript_to_protein:
        protein = transcript_to_protein.get(transcript)

matrix = open("/scratch/RNASeq/diffExpr.P1e-3_C2.matrix")
for line in matrix:
      matrixFields = line.rstrip("\n").split("\t")
      transcript = matrixFields[0]
      Sp_ds = matrixFields[1]
      Sp_hs = matrixFields[2]
      Sp_log = matrixFields[3]
      Sp_plat = matrixFields[4]

tab = "\t"
fields = (protein,Sp_ds,Sp_hs,Sp_log,Sp_plat)
out = open("parsed_blast.txt","w")
out.write(tab.join(fields))
matrix.close()
blast.close()
out.close()

Best answer

This is a scoping problem caused by incorrect indentation. Only these two lines are inside your first loop:

for line in blast:
  line= [item.split('|') for item in line.split()]
  (transcript, swissProt) = parse_blast(blast_line = line)

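So the loop runs through to the last line without saving the values you get along the way. I think you should change the indentation to this: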

transcript_to_protein = {} # 1. declare the dictionary

for line in blast:
      line= [item.split('|') for item in line.split()]
      (transcript, swissProt) = parse_blast(blast_line = line)
      transcript_to_protein[transcript] = swissProt # 2. Add the data to the dictionary
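
To see the effect of this change, here is a minimal sketch that runs the corrected loop on two BLAST lines copied from the question. The list named blast stands in for the open file (an assumption made only so the snippet runs without the real file), and parse_blast is the function from the question, with the default argument dropped.

```python
def parse_blast(blast_line):
    """Return (transcript, swissProt) from one BLAST line split into fields."""
    transcript = blast_line[0][0]   # first '|' part of the first column
    swissProt = blast_line[1][3]    # fourth '|' part of the second column
    return transcript, swissProt

# Two sample lines from the question, standing in for the open file object.
blast = [
    "c0_g1_i1|m.1\tgi|74665200|sp|Q9HGP0.1|PVG4_SCHPO\t100.00\t372\t0\t0\t1\t372\t1\t372\t0.0\t754",
    "c1000_g1_i1|m.799\tgi|48474761|sp|O94288.1|NOC3_SCHPO\t100.00\t747\t0\t0\t5\t751\t1\t747\t0.0\t1506",
]

transcript_to_protein = {}  # declared once, before the loop

for line in blast:
    fields = [item.split('|') for item in line.split()]
    transcript, swissProt = parse_blast(fields)
    # Inside the loop, so every line adds an entry, not just the last one.
    transcript_to_protein[transcript] = swissProt

print(transcript_to_protein)
# {'c0_g1_i1': 'Q9HGP0.1', 'c1000_g1_i1': 'O94288.1'}
```

With the assignment outside the loop, the dictionary would hold only the entry for the last line.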

This fixes the problem for your first file, but not for the second one, because you do not use the dictionary inside that loop.

So you have to move these lines inside the second loop:

if transcript in transcript_to_protein:
    protein = transcript_to_protein.get(transcript)

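I think you get the idea. I will leave the rest to you: a few lines need to move before the loops, and one or two need to move inside the second loop.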
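
For reference, one possible way to put all of these moves together is sketched below. It is not the only arrangement. The helper names build_lookup, join_matrix and main are hypothetical and do not appear in the question. The sketch also assumes that the matrix columns are separated by whitespace and that the header row has only four fields, so any row without exactly five fields is skipped.

```python
def parse_blast(blast_line):
    """Return (transcript, swissProt) from one BLAST line split into fields."""
    return blast_line[0][0], blast_line[1][3]

def build_lookup(blast_lines):
    """Map each transcript ID to its SwissProt accession (hypothetical helper)."""
    transcript_to_protein = {}
    for line in blast_lines:
        fields = [item.split('|') for item in line.split()]
        if len(fields) < 2:
            continue  # skip blank or truncated lines
        transcript, swissProt = parse_blast(fields)
        transcript_to_protein[transcript] = swissProt
    return transcript_to_protein

def join_matrix(matrix_lines, transcript_to_protein):
    """Return one tab-separated output row per matrix line with a known transcript."""
    rows = []
    for line in matrix_lines:
        matrixFields = line.split()
        if len(matrixFields) != 5:
            continue  # assumed header row (four column names) or malformed line
        transcript = matrixFields[0]
        # The lookup now happens inside the second loop, once per matrix line.
        if transcript in transcript_to_protein:
            protein = transcript_to_protein[transcript]
            rows.append("\t".join([protein] + matrixFields[1:]))
    return rows

def main(blast_path, matrix_path, out_path):
    """Read both input files and write the joined rows to out_path."""
    with open(blast_path) as blast:
        lookup = build_lookup(blast)
    # The output file is opened once, before the loop, and written per row.
    with open(matrix_path) as matrix, open(out_path, "w") as out:
        for row in join_matrix(matrix, lookup):
            out.write(row + "\n")

# Example call with the paths from the question:
# main("/scratch/RNASeq/blastp.outfmt6",
#      "/scratch/RNASeq/diffExpr.P1e-3_C2.matrix",
#      "parsed_blast.txt")
```

The with statements close the files automatically, so the explicit close() calls are no longer needed. Matrix lines whose transcript has no BLAST hit are silently skipped here, and whether that is the desired behavior depends on your data.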

A similar question on this topic can be found on Stack Overflow: https://stackoverflow.com/questions/40700348/
