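python - BeautifulSoup select method returns a traceback

Tags: python beautifulsoup python-requests

I'm still learning the beautifulsoup module. I copied this from the book Automate the Boring Stuff with Python, trying to reproduce its script for fetching an Amazon price, but the .select() call gives me a traceback with the error "TypeError: 'NoneType' object is not callable". The error is frustrating because I can't find much information about it.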

import bs4
import requests


header = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}

def site(url):

    x = requests.get(url, headers=header)
    x.raise_for_status()
    soup = bs4.BeautifulSoup(x.text, "html.parser")
    p = soup.Select('#buyNewSection > a > h5 > div > div.a-column.a-span8.a-text-right.a-span-last > div > span.a-size-medium.a-color-price.offer-price.a-text-normal')
    abc = p[0].text.strip()
    return abc

price = site('https://www.amazon.com/Automate-Boring-Stuff-Python-Programming/dp/1593275994')
print('price is' + str(price))

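It should return a list containing the price, but I get this error instead.

Best answer

If you use soup.select instead of soup.Select, your code does work; it just returns an empty list. The reason becomes clear if we inspect the attribute you are actually calling: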

help(soup.Select)

Out[1]:
Help on NoneType object:

class NoneType(object)
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.

Compare:

help(soup.select)

Out[2]:
Help on method select in module bs4.element:

select(selector, namespaces=None, limit=None, **kwargs) method of bs4.BeautifulSoup instance
    Perform a CSS selection operation on the current element.

    This uses the SoupSieve library.

    :param selector: A string containing a CSS selector.

    :param namespaces: A dictionary mapping namespace prefixes
    used in the CSS selector to namespace URIs. By default,
    Beautiful Soup will use the prefixes it encountered while
    parsing the document.

    :param limit: After finding this number of results, stop looking.

    :param kwargs: Any extra arguments you'd like to pass in to
    soupsieve.select().
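
The difference between the two help() outputs can be reproduced with a minimal sketch against a tiny inline document (the HTML string here is made up for illustration, not taken from Amazon). Beautiful Soup treats an unknown attribute on a soup or tag object as a shortcut for find() with that tag name, so soup.Select looks for a <Select> tag, finds none, and returns None. Calling None then raises the TypeError from the question.

```python
import bs4

# Hypothetical stand-in document, not the real Amazon page.
html = '<div id="buyNewSection"><span class="offer-price">$16.83</span></div>'
soup = bs4.BeautifulSoup(html, "html.parser")

# Unknown attribute: treated like soup.find("Select"), which finds nothing.
attr_lookup = soup.Select
print(attr_lookup)

# Calling that None value reproduces the error from the question.
try:
    soup.Select("#buyNewSection span")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The real method is lowercase and returns a list of matching tags.
matches = soup.select("#buyNewSection span.offer-price")
print([m.text for m in matches])
```

Because any misspelled method name silently becomes a tag lookup, a "'NoneType' object is not callable" error on a soup object is usually worth checking for a typo first.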

That said, the page's actual structure appears to differ from what your selector expects: there is no <a> tag.

<div id="buyNewSection" class="rbbHeader dp-accordion-row">
   <h5>
      <div class="a-row">
         <div class="a-column a-span4 a-text-left a-nowrap">
            <span class="a-text-bold">Buy New</span>
         </div>
         <div class="a-column a-span8 a-text-right a-span-last">
            <div class="inlineBlock-display">
               <span class="a-letter-space"></span>
               <span class="a-size-medium a-color-price offer-price a-text-normal">$16.83</span>
            </div>
         </div>
      </div>
   </h5>
</div>

So this should work:

p = soup.select('#buyNewSection > h5 > div > div.a-column.a-span8.a-text-right.a-span-last > div.inlineBlock-display > span.a-size-medium.a-color-price.offer-price.a-text-normal')
abc = p[0].text.strip()
abc

Out[2]:
'$16.83'
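
To check the effect of the extra <a> step without hitting the live site, here is a small sketch that runs both selector shapes against the HTML fragment quoted in this answer. The shortened selectors and the empty-list guard are assumptions added for illustration; the live page may look different from this fragment.

```python
import bs4

# Fragment copied from the answer, trimmed to the relevant nodes.
html = """
<div id="buyNewSection" class="rbbHeader dp-accordion-row">
   <h5>
      <div class="a-row">
         <div class="a-column a-span8 a-text-right a-span-last">
            <div class="inlineBlock-display">
               <span class="a-size-medium a-color-price offer-price a-text-normal">$16.83</span>
            </div>
         </div>
      </div>
   </h5>
</div>
"""
soup = bs4.BeautifulSoup(html, "html.parser")

# Selector with the <a> step from the question: nothing matches.
with_a = soup.select("#buyNewSection > a > h5 span.offer-price")
# Same selector without the <a> step.
without_a = soup.select("#buyNewSection > h5 span.offer-price")

print(with_a)

# Guard against an empty result instead of indexing [0] blindly.
price = without_a[0].text.strip() if without_a else None
print(price)
```

Checking for an empty list before indexing turns a confusing IndexError into an explicit "no price found" case when the page layout changes.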

In addition, consider a more granular approach, which makes the code easier to debug. For example:

buySection = soup.find('div', attrs={'id':'buyNewSection'})
buySpan = buySection.find('span', attrs={'class': 'a-size-medium a-color-price offer-price a-text-normal'})

buySpan.text.strip()
Out[1]:
'$16.83'
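
The step-by-step lookup can also be wrapped so that each stage is checked before the next one runs. The following is only a sketch: the function name extract_price and the choice to return None for a missing element are assumptions for illustration, and matching on the single class offer-price rather than the full class string is a simplification.

```python
import bs4


def extract_price(html):
    """Return the price text, or None if an expected element is missing."""
    soup = bs4.BeautifulSoup(html, "html.parser")

    # Step 1: locate the container; find() returns None when it is absent.
    section = soup.find("div", attrs={"id": "buyNewSection"})
    if section is None:
        return None

    # Step 2: locate the price span inside that container only.
    span = section.find("span", class_="offer-price")
    if span is None:
        return None

    return span.text.strip()


sample = (
    '<div id="buyNewSection"><h5>'
    '<span class="a-size-medium a-color-price offer-price a-text-normal">$16.83</span>'
    "</h5></div>"
)
print(extract_price(sample))
print(extract_price("<p>no price here</p>"))
```

Splitting the lookup this way shows which element went missing, which a single long CSS selector cannot tell you.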

Regarding python - BeautifulSoup select method returns a traceback, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57368957/
