python - How do I get the current price of a stock from Google Finance using Beautiful Soup?

I have the following Python code. The goal is to get the current price of this stock, which is $110.80.

import urlparse
import urllib2
import pdb
from bs4 import BeautifulSoup
from pprint import pprint

url = "https://www.google.com.hk/finance?q=0001&ei=yF14VYC4F4Wd0ASb64CoCw"

def WebCrawl(url):
    htmltext = urllib2.urlopen(url).read()
    soup = BeautifulSoup(htmltext)
    P = soup.find()
    print P

WebCrawl(url)

Now, when I print soup, the number 110.80 actually appears in several places, for example:

{u:"/finance?q=HKG:0001",name:"0001",cp:"-1.07",p:"110.80",cid:"164573760542896"}

and

<span id="ref_164573760542896_l">110.80</span>

and

<meta content="110.80" itemprop="price"/>

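First question: where in the HTML is the right place to look for the stock's current price, given that the price appears in several parts of the HTML?

Second question: what should I pass to soup.find() or soup.find_all() so that I get the current price of this particular stock? Can anyone help?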

Best answer

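find() lets you look up tags in the HTML DOM. For example, to get the page title you can call bs.find("title"), which returns the first instance of the title tag (e.g. <title>Some title here</title>). You can also filter tags by specific attributes. Many sites contain a large number of divs, but if you want a div whose class is red, you can call bs.find('div', attrs={'class': 'red'}). This returns the first div that has the class red. Read the documentation for more detail.

For this example, you can do something like the following to get the stock price: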
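
As a minimal sketch of the two lookups described above, the snippet below runs find() and find_all() against a small inline document. The HTML string and its class names (blue, red, green) are made up for illustration and do not come from Google Finance. The parser name "html.parser" is the one bundled with Python, and the code is written for Python 3.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented purely to illustrate find(); not real page content.
html = """
<html><head><title>Some title here</title></head>
<body>
<div class="blue">first</div>
<div class="red">second</div>
<div class="red">third</div>
</body></html>
"""

bs = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag.
title_tag = bs.find("title")

# Filtering by attribute: the first div whose class is "red".
red_div = bs.find("div", attrs={"class": "red"})

# find_all() returns every match as a list-like result.
all_red = bs.find_all("div", attrs={"class": "red"})

# When nothing matches, find() returns None instead of raising.
missing = bs.find("div", attrs={"class": "green"})

print(title_tag.text)
print(red_div.text)
print(len(all_red))
print(missing)
```

Because find() returns None when there is no match, chaining .text directly onto the result raises AttributeError if the page layout changes, so checking for None first is usually worthwhile.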

import urllib2
from bs4 import BeautifulSoup

url = "https://www.google.com.hk/finance?q=0001&ei=yF14VYC4F4Wd0ASb64CoCw"

def WebCrawl(url):
    htmltext = urllib2.urlopen(url).read()
    soup = BeautifulSoup(htmltext, "html.parser")  # name the parser explicitly
    p = soup.find("span", attrs={"id": "ref_164573760542896_l"}).text
    print p

WebCrawl(url)

For the meta tag you can do this:

p = soup.find("meta", attrs={"itemprop": "price"})
print p['content']
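
The two lookups can also be combined into one helper that does not fail when a tag is missing. This is only a sketch under stated assumptions: extract_price and SAMPLE_HTML are hypothetical names, the sample string simply reuses the two tags quoted in the question so no network request is made, and trying the meta tag before the span is an arbitrary choice rather than a statement about which location is authoritative. The code is written for Python 3.

```python
from bs4 import BeautifulSoup

# The span and meta tags quoted in the question, embedded so nothing is downloaded.
SAMPLE_HTML = """
<span id="ref_164573760542896_l">110.80</span>
<meta content="110.80" itemprop="price"/>
"""


def extract_price(html):
    """Return the price as a string, or None if neither tag is present.

    Hypothetical helper: tries the meta tag first, then the span.
    """
    soup = BeautifulSoup(html, "html.parser")

    # <meta itemprop="price" content="..."> keeps the value in an attribute.
    meta = soup.find("meta", attrs={"itemprop": "price"})
    if meta is not None and meta.has_attr("content"):
        return meta["content"]

    # The span keeps the value as text; its id is specific to this stock.
    span = soup.find("span", attrs={"id": "ref_164573760542896_l"})
    if span is not None:
        return span.text.strip()

    return None


price = extract_price(SAMPLE_HTML)
print(price)
```

The span id embeds the same number as the cid field shown in the question, so it is tied to this one stock, whereas the itemprop="price" lookup does not depend on a stock-specific identifier.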

This question and answer come from a similar question found on Stack Overflow: https://stackoverflow.com/questions/30762459/
