python - Bad Request: what is wrong with my Biopython script?

Tags: python bioinformatics biopython

Hey, what is wrong with my script? It fails with a Bad Request error, and I can't figure out why.

from Bio import Entrez
Entrez.email = 'matro@gmail.com'
import time



def fetch(ID):
    handle = Entrez.efetch(db = 'Protein', id = ID, retmode = 'fasta', rettype = 'text') #<--- here
    seq = handle.read()
    time.sleep(1)
    return seq
ids = ['ATK1','Cat','Lig1']
out = [fetch(id) for id in ids]
with open('out.fasta', 'w') as f:
    f.writelines(out)

Traceback:

File "<ipython-input-42-0be173f176eb>", line 1, in <module>
runfile('C:/Users/MGrad/bioPythonSearch.py', wdir='C:/Users/MGrad/Dropbox/Leg')

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
execfile(filename, namespace)

File "C:\Users\Local\conda\conda\envs\my_root\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "C:/Users/Leg/bioPythonSearch.py", line 20, in <module>
out = [fetch(id) for id in ids] # where ids is a Python list containing gene ids/accession numbers

File "C:/Users/MGrad/bioPythonSearch.py", line 20, in <listcomp>
out = [fetch(id) for id in ids] # where ids is a Python list containing gene ids/accession numbers

File "C:/Users/MGrad/bioPythonSearch.py", line 14, in fetch
handle = Entrez.efetch(db = 'Protein', id = ID, retmode = 'fasta', rettype = 'text')

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\site-packages\Bio\Entrez\__init__.py", line 180, in efetch
return _open(cgi, variables, post=post)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\site-packages\Bio\Entrez\__init__.py", line 526, in _open
raise exception

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\site-packages\Bio\Entrez\__init__.py", line 524, in _open
handle = _urlopen(cgi)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\urllib\request.py", line 223, in urlopen
return opener.open(url, data, timeout)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\urllib\request.py", line 532, in open
response = meth(req, response)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\urllib\request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\urllib\request.py", line 570, in error
return self._call_chain(*args)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\urllib\request.py", line 504, in _call_chain
result = func(*args)

File "C:\Users\MGrad\AppData\Local\conda\conda\envs\my_root\lib\urllib\request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)

HTTPError: Bad Request

Best answer

Entrez.efetch() works with exact ID numbers. If you want to look up a term like ATK1, you first need to resolve it to one or more ID numbers via Entrez.esearch(). Here is a simple but working example:

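import time
import xml.etree.ElementTree as ElementTree  # cElementTree was removed in Python 3.9

from Bio import Entrez
from Bio import SeqIO

TERMS = ['ATK1', 'Cat', 'Lig1']

Entrez.email = 'matro@gmail.com'

def fetch(term):
    # Resolve the search term to an ID number first.
    # retmax=1 returns only the first of possibly many hits, which may be
    # the wrong one; use a more specific term or ID where possible.
    handle = Entrez.esearch(db='Protein', term=term, retmax=1)
    root = ElementTree.fromstring(handle.read())
    handle.close()

    id_node = root.find("IdList/Id")
    if id_node is None:
        raise ValueError('no Protein record found for term: ' + term)
    id_number = id_node.text

    print(term, '->', id_number)  # e.g. ATK1 -> 1039008188

    # rettype selects the record format, retmode the encoding of the reply.
    handle = Entrez.efetch(db='Protein', id=id_number, retmode='text', rettype='fasta')
    seq_record = SeqIO.read(handle, 'fasta')
    handle.close()

    time.sleep(1)  # pause between requests to go easy on the NCBI servers
    return seq_record

out = [fetch(my_term) for my_term in TERMS]

with open('out.fasta', 'w') as f:
    for record in out:
        SeqIO.write(record, f, 'fasta')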

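Entrez.esearch() returns its results as an XML document, so we parse it with ElementTree. This query has multiple results, but we naively ask for only one; you will need to deal with that by inspecting several results or by supplying a more specific term.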
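
To inspect several hits instead of only the first, you could pull every Id out of the esearch reply and compare the total count with what came back. The following is a minimal offline sketch, not part of the original answer: SAMPLE_ESEARCH_XML is a hand-written, trimmed stand-in for a real reply (the second and third Ids are made-up placeholders), and parse_esearch_ids is a hypothetical helper name.

```python
import xml.etree.ElementTree as ElementTree

# Hand-written sample in the shape of an esearch XML reply (trimmed).
# Only the first Id appears in the answer above; the other two are
# made-up placeholders, not real database records.
SAMPLE_ESEARCH_XML = """<eSearchResult>
  <Count>3</Count>
  <RetMax>3</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>1039008188</Id>
    <Id>1000000002</Id>
    <Id>1000000003</Id>
  </IdList>
</eSearchResult>
"""


def parse_esearch_ids(xml_text):
    """Return (total_count, list_of_ids) from an esearch XML document."""
    root = ElementTree.fromstring(xml_text)
    # Count is the total number of hits; IdList holds at most retmax of them.
    count = int(root.findtext("Count", default="0"))
    ids = [node.text for node in root.findall("IdList/Id")]
    return count, ids


count, ids = parse_esearch_ids(SAMPLE_ESEARCH_XML)
print(count, ids)
```

With a real query you would pass handle.read() from Entrez.esearch() (called with a larger retmax) to the helper. If count is greater than 1, the term is ambiguous and probably needs narrowing before you trust the first Id.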

Also, your code had the values of retmode and rettype swapped.
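
To make the two parameters easier to tell apart, here is a small sketch that builds an efetch query string by hand, with rettype naming the record format (fasta) and retmode naming how the reply is encoded (text). It is for illustration only: efetch_fasta_url is a hypothetical helper, the lowercase db name is my assumption, and Biopython builds its own request (including extra parameters such as your email), so the real URL will differ.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(id_number, db="protein"):
    """Build an efetch URL that asks for one record as FASTA text."""
    params = {
        "db": db,
        "id": id_number,
        "rettype": "fasta",  # what kind of record to return
        "retmode": "text",   # how the reply is encoded
    }
    return EFETCH_URL + "?" + urlencode(params)


url = efetch_fasta_url("1039008188")
print(url)
```

Pasting a URL like this into a browser is a quick way to check whether a given id and rettype/retmode combination is accepted before wiring it into a script.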

Regarding "python - Bad Request: what is wrong with my Biopython script?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45333658/
