python - Scraping data from the National Vulnerability Database: can't figure out clicking on a button (Mechanize+Python)

Tags: python mechanize

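I am trying to scrape some data from the National Vulnerability Database (http://web.nvd.nist.gov). What I want to do is enter a search term, which brings up the first 20 results, and scrape that data. Then I want to click "next 20" repeatedly until I have gone through all the results.

I can submit the search term successfully, but clicking "next 20" does not work at all.

The tools I am using are Python + Mechanize.

Here is my code: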

import mechanize

# Browser
b = mechanize.Browser()

# The URL to this service
URL = 'http://web.nvd.nist.gov/view/vuln/search'
Search = ['Linux', 'Mac OS X', 'Windows']

def searchDB():
    for term in Search:
        # Load the page
        b.open(URL)
        # Select the form
        b.select_form(nr=0)
        # Fill out the search form
        b['vulnSearchForm:text'] = term
        b.submit('vulnSearchForm:j_id120')
        result = b.response().read()
        # Save the first page of results, one file per search term
        with open(term + ".txt", "wb") as f:
            f.write(result)

        # Here is where the problem is. vulnResultsForm:j_id116 is the
        # value of the "next 20" button.
        b.select_form(nr=0)
        b.form.click('vulnResultsForm:j_id116')
        result = b.response().read()

if __name__ == '__main__':
    searchDB()

Best answer

From the docstring of b.form.click:

Return request that would result from clicking on a control.

The request object is a urllib2.Request instance, which you can pass to urllib2.urlopen (or ClientCookie.urlopen).



So click() only builds the request; it does not send it. Open the request yourself:
request = b.form.click('vulnResultsForm:j_id116')
b.open(request)
result = b.response().read()
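
To go through every page of results rather than just the second one, the same two calls can be repeated in a loop. The following is only a sketch under a few assumptions: the helper name fetch_all_pages and the max_pages safety cap are made up for illustration, the results form is assumed to be the first form on each page (as in the question), and the last page is assumed to have no control with the "next 20" name.

```python
def fetch_all_pages(browser, next_control, max_pages=50):
    """Return the body of the current page plus every page reached by
    repeatedly clicking the control named next_control.

    Hypothetical helper: the name and the max_pages cap are illustrative.
    """
    pages = [browser.response().read()]
    while len(pages) < max_pages:
        # Assumption: the results form is the first form on the page.
        browser.select_form(nr=0)
        names = [control.name for control in browser.form.controls]
        if next_control not in names:
            # No "next" button on this page, so this was the last one.
            break
        # click() only builds the request; open() actually sends it.
        request = browser.form.click(next_control)
        browser.open(request)
        pages.append(browser.response().read())
    return pages
```

After the search form has been submitted, it could be called as fetch_all_pages(b, 'vulnResultsForm:j_id116'). Checking the form's controls before clicking avoids relying on an exception to detect the last page.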

This question and its answer come from Stack Overflow: https://stackoverflow.com/questions/4396774/
