python - Why can't I use urllib2.urlopen for Wikipedia sites?

Tags: python networking network-programming wikipedia


Possible Duplicate:
Fetch a Wikipedia article with Python

>>> print urllib2.urlopen('http://zh.wikipedia.org/wiki/%E6%AF%9B%E6%B3%BD%E4%B8%9C').read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
    response = meth(req, response)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 513, in http_response
    'http', request, response, code, msg, hdrs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 438, in error
    return self._call_chain(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 521, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

Best answer

You need to provide a User-Agent header, or else you'll get a 403, as you did.

On Wikimedia wikis, if you don't supply a User-Agent header, or you supply an empty or generic one, your request will fail with an HTTP 403 error. See our User-Agent policy. Other MediaWiki installations may have similar policies.

So just add a User-Agent header to the request in your code and it should work fine.
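As a minimal sketch of that fix: build a Request object that carries an explicit User-Agent header and pass it to urlopen instead of the bare URL. The USER_AGENT string and the helper names build_request and fetch are hypothetical, chosen only for illustration; use a value that identifies your own script and gives a way to contact you. The import fallback is there only so the same snippet also runs on Python 3, where urllib2 became urllib.request.

```python
try:
    # Python 2, as used in the question
    from urllib2 import Request, urlopen
except ImportError:
    # Python 3 moved these names into urllib.request
    from urllib.request import Request, urlopen

# Hypothetical identifier: replace with your own tool name and contact details.
USER_AGENT = 'MyWikiFetcher/0.1 (https://example.org/contact; me@example.org)'


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return Request(url, headers={'User-Agent': user_agent})


def fetch(url, user_agent=USER_AGENT):
    """Open the URL with the custom User-Agent and return the raw body."""
    response = urlopen(build_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()
```

With this in place, fetch('http://zh.wikipedia.org/wiki/%E6%AF%9B%E6%B3%BD%E4%B8%9C') should return the page body instead of raising HTTPError, provided the server accepts the User-Agent you chose.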

Source: the original question on Stack Overflow, "Why can't I use urllib2.urlopen for Wikipedia sites?": https://stackoverflow.com/questions/11814287/
