python - Exception Type: MissingSchema/beautifulsoup

Tags: python django beautifulsoup

I am exploring beautifulsoup in Django, and it worked fine until I used a variable as the URL (taken from a model text field with a URLValidator).

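I also tried prepending "https://" + to model_url, but that gave an error as well. What could be the problem? I have googled a lot, but nothing worked. I hope this is not a duplicate question. Thanks in advance!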

Here is my models.py

from django.core.validators import URLValidator
from django.db import models


class ScrapeUrl(models.Model):
    product_title = models.CharField(max_length=255)
    product_ean = models.CharField(max_length=25)
    scrape_url = models.TextField(validators=[URLValidator()])
    shop_price = models.DecimalField(max_digits=10, decimal_places=2)

    def __str__(self):
        return self.product_title

    def __unicode__(self):
        return unicode(self.product_title) or u''

Here is my views.py

import bs4
import requests
from django.shortcuts import render

from .models import ScrapeUrl


def scrape_list_view(request):
    model_url = ScrapeUrl.scrape_url
    response = requests.get(model_url)
    soup = bs4.BeautifulSoup(response.text)
    price = soup.find("span", {"class": "promo-price"}).text

    price_dot = price.replace(",", ".").replace('-', '0')
    price_break = price_dot.replace('\r', '').replace('\n', '').replace(' ', '')
    price_data = float(price_break)

    return render(request, 'scrape_list.html', {'price_data': price_data})

Here is the traceback

Environment:


Request Method: GET
Request URL: http://localhost:8000/app/

Django Version: 1.11.13
Python Version: 2.7.14
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'mathfilters',
 'scrapeapp']
Installed Middleware:
['django.middleware.security.SecurityMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware']



Traceback:

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/django/core/handlers/exception.py" in inner
  41.             response = get_response(request)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/django/core/handlers/base.py" in _get_response
  187.                 response = self.process_exception_by_middleware(e, request)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/django/core/handlers/base.py" in _get_response
  185.                 response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/Users/sanderhegeman/scraper/scrapeapp/views.py" in scrape_list_view
  18.   response = requests.get(good_url)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/requests/api.py" in get
  72.     return request('get', url, params=params, **kwargs)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/requests/api.py" in request
  58.         return session.request(method=method, url=url, **kwargs)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/requests/sessions.py" in request
  494.         prep = self.prepare_request(req)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/requests/sessions.py" in prepare_request
  437.             hooks=merge_hooks(request.hooks, self.hooks),

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/requests/models.py" in prepare
  305.         self.prepare_url(url, params)

File "/Users/sanderhegeman/scraper/lib/python2.7/site-packages/requests/models.py" in prepare_url
  379.             raise MissingSchema(error)

Exception Type: MissingSchema at /app/
Exception Value: Invalid URL '<django.db.models.query_utils.DeferredAttribute object at 0x10378b050>': No schema supplied. Perhaps you meant http://<django.db.models.query_utils.DeferredAttribute object at 0x10378b050>?

Best answer

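The problem is that you are reading ScrapeUrl.scrape_url from the model class itself. That is not a string; it is an attribute of the Django model class, not the value stored in any particular row. When you pass it to requests.get, it is converted to its string representation, something like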

'<django.db.models.query_utils.DeferredAttribute object at 0x10378b050>'
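The same behaviour can be reproduced without Django. The sketch below is only an illustration: FieldDescriptor is a hypothetical stand-in for Django's field descriptor, not Django's actual implementation, and this ScrapeUrl is a plain Python class rather than a real model. It shows that attribute access on the class returns the descriptor object, while access on an instance returns the stored string.

```python
class FieldDescriptor:
    """Hypothetical stand-in for a model field descriptor (not Django's real code)."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: there is no row to read a value from,
            # so the descriptor object itself is returned.
            return self
        # Accessed on an instance: return the value stored for that instance.
        return instance._values[self.name]


class ScrapeUrl:
    """Plain Python class that mimics the shape of the model in the question."""

    scrape_url = FieldDescriptor("scrape_url")

    def __init__(self, scrape_url):
        self._values = {"scrape_url": scrape_url}


# Class-level access, as in the view from the question.
class_level = ScrapeUrl.scrape_url

# Instance-level access, which is what a URL fetch actually needs.
row = ScrapeUrl("https://example.com/product/1")
instance_level = row.scrape_url

print(isinstance(class_level, str))     # the class attribute is not a string
print(str(class_level))                 # an "<... object at 0x...>" repr
print(instance_level)                   # the stored URL
```

In a real Django project the instance would come from a query such as ScrapeUrl.objects.get(...), not from calling the constructor by hand.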

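That DeferredAttribute repr is clearly not a valid URL, which is why requests raises MissingSchema. You probably want to retrieve an object from the database based on a query parameter or the URL path. To do that, you could write something like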

model_url = ScrapeUrl.objects.get(pk=int(request.GET['id'])).scrape_url

Note that this can still fail if the query parameter is missing, if it is not an integer, or if no matching object exists in the database.
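The first two failure cases can be checked before touching the database. The following is a minimal sketch, assuming Python 3 (on Python 2.7, as in the traceback, urlparse lives in the urlparse module instead). The helper names parse_pk and has_http_scheme are hypothetical and not part of Django or requests; query_params stands for any dict-like object such as request.GET.

```python
from urllib.parse import urlparse


def parse_pk(query_params):
    """Return the 'id' query parameter as an int, or None if it is
    missing or cannot be parsed as an integer."""
    raw = query_params.get("id")
    if raw is None:
        return None
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


def has_http_scheme(url):
    """Return True only for strings that carry an http(s) scheme and a host."""
    if not isinstance(url, str):
        # Catches the case from the question, where a non-string object
        # was passed along as if it were a URL.
        return False
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(parse_pk({"id": "7"}))
print(parse_pk({"id": "abc"}))
print(has_http_scheme("https://example.com/p/1"))
print(has_http_scheme("example.com/p/1"))
```

In the view, a None result from parse_pk could be turned into a 400 response, and the third failure case (no matching row) can be handled with Django's get_object_or_404(ScrapeUrl, pk=pk) from django.shortcuts, which returns a 404 instead of letting DoesNotExist propagate.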

Regarding python - Exception Type: MissingSchema/beautifulsoup, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50281247/
