Python: testing for a URL and image type

In the code below, how can I test whether the type is a URL or an image?

for dictionaries in d_dict:
    type = dictionaries.get('type')
    if (type starts with http or https):
        logging.debug("type is url")
    else if type ends with .jpg or .png or .gif:
        logging.debug("type is image")
    else:
        logging.debug("invalid type")

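Best answer

You cannot tell what type a resource is from its URL alone. It is perfectly valid to serve a GIF file at a URL that has no .gif extension, or one with a misleading extension such as .txt. In fact, now that URL rewriting is popular, you are quite likely to come across image URLs with no file extension at all.

What governs the type of a web resource is the Content-Type HTTP response header, so the only way to know for sure is to fetch the resource and look at the response you get. You can do this by inspecting the headers returned by urllib.urlopen(url).headers, but that actually fetches the file itself. For efficiency you may prefer to issue a HEAD request, which does not transfer the whole file: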

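import logging
import urllib2

# Python 2: subclass Request so that urlopen sends HEAD instead of GET.
class HeadRequest(urllib2.Request):
    def get_method(self):
        return 'HEAD'

response = urllib2.urlopen(HeadRequest(url))
# Drop parameters such as "; charset=..." and normalise case.
maintype = response.headers['Content-Type'].split(';')[0].strip().lower()
if maintype not in ('image/png', 'image/jpeg', 'image/gif'):
    logging.debug('invalid type')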
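
The snippet above targets Python 2 (urllib2). For readers on Python 3, a rough equivalent might look like the sketch below. The helper names media_type and is_image_url, the IMAGE_TYPES tuple, and the opener parameter are illustrative assumptions, not part of the original answer; opener exists only so the function can be exercised without a network connection.

```python
import urllib.request

# Assumed whitelist, copied from the tuple used in the answer.
IMAGE_TYPES = ('image/png', 'image/jpeg', 'image/gif')

def media_type(content_type):
    """Drop parameters such as '; charset=...' and normalise case."""
    return content_type.split(';')[0].strip().lower()

def is_image_url(url, opener=urllib.request.urlopen):
    """Send a HEAD request and check the Content-Type header.

    `opener` is a hypothetical hook for substituting a fake urlopen.
    """
    # Request accepts method='HEAD' directly (Python 3.3+), so no
    # subclass is needed.
    request = urllib.request.Request(url, method='HEAD')
    with opener(request) as response:
        header = response.headers.get('Content-Type', '')
    return media_type(header) in IMAGE_TYPES
```

Some servers reject or mishandle HEAD requests, so a caller may still want to catch urllib.error.URLError and decide how to treat such URLs.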

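If you must try to sniff the type from the file extension in the path part of the URL (for example, because you have no network connection), you should parse the URL with urlparse first to remove any ?query or #fragment part, so that http://www.example.com/image.png?blah=blah&foo=.txt does not confuse it. You should also consider using mimetypes to map the filename to a Content-Type, so that you can take advantage of its knowledge of file extensions: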

import logging
import mimetypes
import urlparse

# Python 2: .path excludes the ?query and #fragment parts of the URL.
maintype = mimetypes.guess_type(urlparse.urlparse(url).path)[0]
if maintype not in ('image/png', 'image/jpeg', 'image/gif'):
    logging.debug('invalid type')

(This also allows other extensions, for example. You should at the very least allow .jpeg for image/jpeg files, as well as the mutant three-letter Windows variant .jpg.)
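
As a sketch of the same offline check on Python 3, where urlparse lives in urllib.parse: the function name guess_is_image and the IMAGE_TYPES tuple are assumptions made for illustration. This only guesses from the extension, so it shares the limitation described at the top of the answer.

```python
import mimetypes
from urllib.parse import urlparse

# Assumed whitelist, copied from the tuple used in the answer.
IMAGE_TYPES = ('image/png', 'image/jpeg', 'image/gif')

def guess_is_image(url):
    """Guess from the extension in the URL path whether it is an image."""
    # .path excludes the ?query and #fragment parts, so an extension
    # inside a query parameter cannot mislead the guess.
    path = urlparse(url).path
    guessed_type, _encoding = mimetypes.guess_type(path)
    # guessed_type is None when the extension is missing or unknown.
    return guessed_type in IMAGE_TYPES
```

With this, both http://www.example.com/photo.jpg and http://www.example.com/photo.jpeg map to image/jpeg, while a rewritten URL with no extension is reported as not an image even if the server would actually return one.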

Regarding testing for a URL and image type in Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/3702331/
