python - Google search scraper, Python

I'm new to Python and trying to build a Google search scraper to get stock prices, but when I run the code below I don't get any results; I only get the page's HTML.

import urllib.request
from bs4 import BeautifulSoup

import requests

url = 'https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uwti'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(html, "html.parser")

print(soup.prettify())

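Am I missing something very simple? Please give me some pointers. I'm trying to extract the current stock value. How can I extract the value shown in the attached image?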

Best answer

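The value is in the page source when you right-click and choose "View source" in your browser. You just need to change the url slightly and pass a user agent, so that what you get back with requests matches what you see there: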

In [2]: from bs4 import BeautifulSoup
   ...: import requests
   ...: 
   ...: url = 'https://www.google.com/search?q=uwti&rct=j'
   ...: response = requests.get(url, headers={
   ...:     "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
   ...:                   "(KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"})
   ...: html = response.content
   ...: 
   ...: soup = BeautifulSoup(html, "html.parser")
   ...: print(soup.select_one("span._Rnb.fmob_pr.fac-l").text)
   ...: 
27.51
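
The selector above depends on class names such as `_Rnb` that Google generates and can change at any time, in which case `select_one` returns `None` and `.text` raises an `AttributeError`. A minimal defensive sketch, assuming the same selector as above; the helper name `extract_price` and the sample HTML are made up for illustration and are not part of the original answer:

```python
from typing import Optional

from bs4 import BeautifulSoup

# Selector copied from the answer; treat it as an assumption that may go stale.
PRICE_SELECTOR = "span._Rnb.fmob_pr.fac-l"


def extract_price(html: str, selector: str = PRICE_SELECTOR) -> Optional[str]:
    """Return the text of the first tag matching selector, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one(selector)
    # select_one returns None when nothing matches, so check before using it.
    return tag.get_text(strip=True) if tag is not None else None


# Hypothetical snippets standing in for response.content.
sample = '<div><span class="_Rnb fmob_pr fac-l"> 27.51 </span></div>'
print(extract_price(sample))
print(extract_price("<div>no price here</div>"))
```

Passing `response.content` from the request above to `extract_price` would give either the price text or `None`, which is easier to handle than an exception raised in the middle of a script.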

soup.find("span", class_="_Rnb fmob_pr fac-l").text would also work, and it is the correct way to look up a tag by its CSS classes with find or find_all.
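
The two lookups do not behave identically. According to the Beautiful Soup documentation, a CSS selector matches any tag that carries all of the listed classes in any order, while a multi-class string passed to `class_` is compared against the exact value of the `class` attribute. A small sketch on a made-up HTML fragment (the second `span` is an assumption added only to show the difference):

```python
from bs4 import BeautifulSoup

html = (
    "<div>"
    '<span class="_Rnb fmob_pr fac-l">27.51</span>'
    '<span class="fmob_pr">other</span>'
    "</div>"
)
soup = BeautifulSoup(html, "html.parser")

# CSS selector: every listed class must be present; the order is irrelevant.
by_selector = soup.select_one("span.fac-l._Rnb.fmob_pr")

# find() with a multi-class string: matches the exact attribute value.
by_exact_string = soup.find("span", class_="_Rnb fmob_pr fac-l")

# A single class name matches every tag that carries that class.
by_single_class = soup.find_all("span", class_="fmob_pr")

print(by_selector.text)
print(by_exact_string.text)
print(len(by_single_class))
```

If the order of the classes in the page is not guaranteed, the selector form is the more forgiving of the two.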

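When you open https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uwti in Chrome, you can see that it redirects to https://www.google.com/search?q=uwti&rct=j . In the original url the search term sits after the #, in the URL fragment, which the browser handles itself and never sends to the server, so requests only ever receives the bare webhp page.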
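
The difference between the two urls can be checked with the standard library alone. The sketch below splits each url and shows where `q=uwti` ends up: in the fragment, which is not part of the HTTP request, or in the query string, which is. The variable names are chosen for illustration:

```python
from urllib.parse import parse_qs, urldefrag, urlsplit

original = ("https://www.google.com/webhp"
            "?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=uwti")
working = "https://www.google.com/search?q=uwti&rct=j"

for url in (original, working):
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # "q" only reaches the server when it is part of the query string.
    print(parts.path, query.get("q"), repr(parts.fragment))

# The url with the fragment removed is what the server is actually asked for.
requested = urldefrag(original).url
print(requested)
```

For the original url the `q` lookup yields `None` and the fragment holds `q=uwti`; for the working url the query string carries `q` and the fragment is empty.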

For python - Google search scraper, Python, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40064588/
