python - Scraping with BeautifulSoup: object has no attribute

Tags: python web-scraping beautifulsoup python-requests

I want to use BeautifulSoup to extract data such as company names and addresses from a website. However, I am running into the following failure:

Calgary's Notary Public 
Traceback (most recent call last):
  File "test.py", line 16, in <module>
    print item.find_all(class_='jsMapBubbleAddress').text
AttributeError: 'ResultSet' object has no attribute 'text'

The HTML snippet is below. I want to extract all of the text information and convert it to a CSV file. Could anyone please help me?

<div class="listing__right article hasIcon">
   <h3 class="listing__name jsMapBubbleName" itemprop="name"><a data-analytics='{"lk_listing_id":"100971374","lk_non-ad-rollup":"0","lk_page_num":"1","lk_pos":"in_listing","lk_proximity":"14.5","lk_directory_heading":[{"085100":[{"00910600":"1"},{"00911000":"1"}]}],"lk_geo_tier":"in","lk_area":"left_1","lk_relevancy":"1","lk_name":"busname","lk_pos_num":"1","lk_se_id":"e292d1d2-f130-463d-8f0c-7dd66800dead_Tm90YXJ5_Q2FsZ2FyeSwgQUI_56","lk_ev":"link","lk_product":"l2"}' href="/bus/Alberta/Calgary/Calgary-s-Notary-Public/100971374.html?what=Notary&amp;where=Calgary%2C+AB&amp;useContext=true" title="See detailed information for Calgary's Notary Public">Calgary's Notary Public</a> </h3>
   <div class="listing__address address mainLocal">
      <em class="itemCounter">1</em>
      <span class="listing__address--full" itemprop="address" itemscope="" itemtype="http://schema.org/PostalAddress">
      <span class="jsMapBubbleAddress" itemprop="streetAddress">340-600 Crowfoot Cres NW</span>, <span class="jsMapBubbleAddress" itemprop="addressLocality">Calgary</span>, <span class="jsMapBubbleAddress" itemprop="addressRegion">AB</span> <span class="jsMapBubbleAddress" itemprop="postalCode">T3G 0B4</span></span>
      <a class="listing__direction" data-analytics='{"lk_listing_id":"100971374","lk_non-ad-rollup":"0","lk_page_num":"1","lk_pos":"in_listing","lk_proximity":"14.5","lk_directory_heading":[{"085100":[{"00910600":"1"},{"00911000":"1"}]}],"lk_geo_tier":"in","lk_area":"left_1a","lk_relevancy":"1","lk_name":"directions-step1","lk_pos_num":"1","lk_se_id":"e292d1d2-f130-463d-8f0c-7dd66800dead_Tm90YXJ5_Q2FsZ2FyeSwgQUI_56","lk_ev":"link","lk_product":"l2"}' href="/merchant/directions/100971374?what=Notary&amp;where=Calgary%2C+AB&amp;useContext=true" rel="nofollow" title="Get direction to Calgary's Notary Public">Get directions »</a>
   </div>
   <div class="listing__details">
      <p class="listing__details__teaser" itemprop="description">We  offer you a convenient, quick and affordable solution for your Notary Public or Commissioner for Oaths in Calgary needs.</p>
   </div>
   <div class="listing__ratings--root">
      <div class="listing__ratings ratingWarp" itemprop="aggregateRating" itemscope="" itemtype="http://schema.org/AggregateRating">
         <meta content="5" itemprop="ratingValue"/>
         <meta content="1" itemprop="ratingCount"/>
         <span class="ypStars" data-analytics-group="stars" data-clicksent="false" data-rating="rating5" title="Ratings: 5 out of 5 stars">
         <span class="star1" data-analytics-name="stars" data-label="Optional : Why did you hate it?" title="I hated it"></span>
         <span class="star2" data-analytics-name="stars" data-label="Optional : Why didn't you like it?" title="I didn't like it"></span>
         <span class="star3" data-analytics-name="stars" data-label="Optional : Why did you like it?" title="I liked it"></span>
         <span class="star4" data-analytics-name="stars" data-label="Optional : Why did you really like it?" title="I really liked it"></span>
         <span class="star5" data-analytics-name="stars" data-label="Optional : Why did you love it?" title="I loved it"></span>
         </span><a class="listing__ratings__count" data-analytics='{"lk_listing_id":"100971374","lk_non-ad-rollup":"0","lk_page_num":"1","lk_pos":"in_listing","lk_proximity":"14.5","lk_directory_heading":[{"085100":[{"00910600":"1"},{"00911000":"1"}]}],"lk_geo_tier":"in","lk_area":"left_1","lk_relevancy":"1","lk_name":"read_yp_reviews","lk_pos_num":"1","lk_se_id":"e292d1d2-f130-463d-8f0c-7dd66800dead_Tm90YXJ5_Q2FsZ2FyeSwgQUI_56","lk_ev":"link","lk_product":"l2"}' href="/bus/Alberta/Calgary/Calgary-s-Notary-Public/100971374.html?what=Notary&amp;where=Calgary%2C+AB&amp;useContext=true#ypgReviewsHeader" rel="nofollow" title="1 of Review for Calgary's Notary Public">1<span class="hidden-phone"> YP review</span></a>
      </div>
   </div>
   <div class="listing__details detailsWrap">
      <ul>
         <li><a href="/search/si/1/Notaries/Calgary%2C+AB" title="Notaries">Notaries</a>
            ,
         </li>
         <li><a href="/search/si/1/Notaries+Public/Calgary%2C+AB" title="Notaries Public">Notaries Public</a></li>
      </ul>
   </div>
</div>

There are many divs with the class listing__right article hasIcon. I am using a for loop to extract the information from each one.

The Python code I have written so far is:

import requests
from bs4 import BeautifulSoup

url = 'http://www.yellowpages.ca/search/si-rat/1/Notary/Calgary%2C+AB'
response = requests.get(url)
content = response.content

soup = BeautifulSoup(content)
g_data=soup.find_all('div', attrs={'class': 'listing__right article  hasIcon'})

for item in g_data:
    print item.find('h3').text
    #print item.contents[2].find_all('em', attrs={'class': 'itemCounter'})[1].text
    print item.find_all(class_='jsMapBubbleAddress').text

Best answer

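find_all returns a ResultSet, a list-like collection of matches, which has no text attribute; that is why you get the error. I'm not sure exactly what output you are looking for, but this code seems to work: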

import requests
from bs4 import BeautifulSoup

url = 'http://www.yellowpages.ca/search/si-rat/1/Notary/Calgary%2C+AB'
response = requests.get(url)
content = response.content

soup = BeautifulSoup(content,"lxml")
g_data=soup.find_all('div', attrs={'class': 'listing__right article  hasIcon'})

for item in g_data:
    print(item.find('h3').text)
    #print item.contents[2].find_all('em', attrs={'class': 'itemCounter'})[1].text
    # find_all returns a ResultSet, so read .text from each element in turn
    addresses = item.find_all(class_='jsMapBubbleAddress')
    for address in addresses:
        print(address.text)
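
Since the question also asks for CSV output, here is a minimal sketch of one way to get there. It assumes Python 3 and runs against a trimmed copy of the HTML from the question rather than the live page, so the real markup may differ. The names SAMPLE_HTML, FIELDS, extract_listings and to_csv are hypothetical helpers introduced only for this illustration, and the column names are simply the itemprop values seen in the snippet.

```python
import csv
import io

from bs4 import BeautifulSoup

# Trimmed copy of one listing from the question (illustrative sample only).
SAMPLE_HTML = """
<div class="listing__right article hasIcon">
  <h3 class="listing__name jsMapBubbleName" itemprop="name"><a href="#">Calgary's Notary Public</a> </h3>
  <span class="jsMapBubbleAddress" itemprop="streetAddress">340-600 Crowfoot Cres NW</span>,
  <span class="jsMapBubbleAddress" itemprop="addressLocality">Calgary</span>,
  <span class="jsMapBubbleAddress" itemprop="addressRegion">AB</span>
  <span class="jsMapBubbleAddress" itemprop="postalCode">T3G 0B4</span>
</div>
"""

# Hypothetical column layout, taken from the itemprop values in the sample.
FIELDS = ["name", "streetAddress", "addressLocality", "addressRegion", "postalCode"]


def extract_listings(html):
    """Return one dict per listing holding the name and the address parts."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # Selecting on a single class avoids depending on the exact spacing
    # of the full class attribute string.
    for listing in soup.select("div.listing__right"):
        heading = listing.find("h3")
        row = {"name": heading.get_text(strip=True) if heading else ""}
        # find_all returns a ResultSet; read the text of each Tag inside it.
        for span in listing.find_all(class_="jsMapBubbleAddress"):
            row[span.get("itemprop", "")] = span.get_text(strip=True)
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text, leaving missing fields empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_listings(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

To write a file instead of a string, the same DictWriter can be given a file opened with open(path, "w", newline=""), and response.content from the requests call above can be passed to extract_listings in place of SAMPLE_HTML.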

Regarding python - Scraping with BeautifulSoup: object has no attribute, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35967854/
