Python's SimpleHTTPServer started in a thread does not release its port

Tags: python multithreading simplehttpserver

I have the following code:

import os
from ghost import Ghost
import urlparse, urllib

import SimpleHTTPServer
import SocketServer

import sys, traceback

from threading import Thread, Event
from time import sleep

please_die = Event() # this is my enemy

httpd = None
PORT = 8001
address = 'http://localhost:'+str(PORT)+'/'
search_dir = './category'

def main():
    """
      basic run script routine, 
      FIXME: is supossed to exits gracefully
    """
    thread = Thread(target = simpleServe)
    try:
      thread.start()
      run()
    except KeyboardInterrupt:
      print "Shutdown requested"
    except Exception:
      traceback.print_exc(file=sys.stdout)

    shutdown()
    sys.exit(0)

def shutdown():
  global httpd
  global please_die
  print "Shutting down"
  # A try - except for the shutdown routine
  try:
    please_die.wait() # how do you do? 

    httpd.shutdown() # Please! I whant to run you multiple times. 
    print "Have you died?"
  except Exception:
    traceback.print_exc(file=sys.stdout)

def path2url(path):
  """
  constructs an url from a relative path / concatenates the global address
  variable with the path given
  """
  global address
  return urlparse.urljoin(address, urllib.pathname2url(path))

def simpleServe():
  global httpd, PORT

  please_die.set() # Attaching the event to this thread

  # Start the service
  Handler = SimpleHTTPServer.SimpleHTTPRequestHandler

  httpd = SocketServer.TCPServer(("", PORT), Handler)

  print "serving at port", PORT
  # And loop infinetly in the hope that I can stop you later
  httpd.serve_forever()

def run():
  global search_dir;

  ghost = Ghost() # the webkit facade

  with ghost.start() as session:

    session.set_viewport_size(2560, 1600) # "retina" size

    for directory, subdirectories, files in os.walk(search_dir):
        for file in files:
            path = os.path.join(directory, file)
            urlPath = path2url(path)
            process(session, urlPath);

def process(session, urlPath):
  page, resources = session.open(urlPath)
  assert page.http_status == 200
  # ... other asserts here 


if __name__ == '__main__':
  main()

The idea is to write a script that starts a "simple HTTP server", makes some requests against it, and then exits.

The first run works without any problems:

...
127.0.0.1 - - [31/Jul/2015 13:16:17] "GET /category/52003.html HTTP/1.1" 200 -
127.0.0.1 - - [31/Jul/2015 13:16:17] "GET /category/52003.html HTTP/1.1" 200 -
127.0.0.1 - - [31/Jul/2015 13:16:17] "GET /category/52003.html HTTP/1.1" 200 -
127.0.0.1 - - [31/Jul/2015 13:16:17] "GET /static/img/glyphicons-halflings.png HTTP/1.1" 200 -
Shutting down
Have you died?

The second time I start it, it crashes with:

Address already in use

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "download-images.py", line 51, in simpleServe
    httpd = SocketServer.TCPServer(("", PORT), Handler)
  File "/usr/lib/python2.7/SocketServer.py", line 420, in __init__
    self.server_bind()
  File "/usr/lib/python2.7/SocketServer.py", line 434, in server_bind
    self.socket.bind(self.server_address)
  File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 98] Address already in use

If I kill all python processes, the script runs again, so I assume I am using threads incorrectly, but I can't find where.

Update

I forgot to mention:

My operating system is:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 15.04
Release:        15.04
Codename:       vivid

The python I am using is:

$ python --version
Python 2.7.9

Running netstat -putelan | grep 8001 prints:

$ netstat -putelan | grep 8001
(Not all processes could be identified, non-owned process info
    tcp        0      0 127.0.0.1:34691         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:8001          127.0.0.1:34866         TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34798         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:8001          127.0.0.1:34588         TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34647         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34915         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34674         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34451         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:8001          127.0.0.1:34930         TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:8001          127.0.0.1:34606         TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34505         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:34717         127.0.0.1:8001          TIME_WAIT   0          0           -               
    tcp        0      0 127.0.0.1:8001          127.0.0.1:34670         0      0 127.0.0.1:8001          127.0.0.1:34626         
...

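I can't post the whole output (because of Stack Overflow's posting limits). The rest looks the same: connections between 34*** ports and port 8001, mixed in the same pattern.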

Best answer

As @LFJ said, this is probably due to the allow_reuse_address attribute of TCPServer.

httpd = SocketServer.TCPServer(("", PORT), Handler, bind_and_activate=False)
httpd.allow_reuse_address = True

try:
    httpd.server_bind()
    httpd.server_activate()
except:
    httpd.server_close()
    raise

Equivalent code:

SocketServer.TCPServer.allow_reuse_address = True
httpd = SocketServer.TCPServer(("", PORT), Handler)
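
Setting the attribute on SocketServer.TCPServer changes it for every TCPServer in the process. A variation that keeps the change local is to set it on a subclass. The sketch below is written for Python 3, where the modules are named socketserver and http.server; ReusableTCPServer is a hypothetical name, and port 0 (any free port) stands in for the question's 8001.

```python
import http.server
import socketserver


class ReusableTCPServer(socketserver.TCPServer):
    # server_bind() reads this flag before binding and, when it is true,
    # sets SO_REUSEADDR on the listening socket.
    allow_reuse_address = True


# Port 0 asks the OS for any free port; the question uses 8001 instead.
httpd = ReusableTCPServer(("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
bound_port = httpd.server_address[1]
httpd.server_close()  # release the listening socket again
```

The base class is left untouched, so other servers in the same process keep the default behaviour.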

Let's explain why.

When you enable TCPServer.allow_reuse_address, it sets an option on the socket:

class TCPServer:
    [...]
    def server_bind(self):
        if self.allow_reuse_address:
            self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        [...]

What is socket.SO_REUSEADDR?

This socket option tells the kernel that even if this port is busy (in
the TIME_WAIT state), go ahead and reuse it anyway.  If it is busy,
but with another state, you will still get an address already in use
error.  It is useful if your server has been shut down, and then
restarted right away while sockets are still active on its port.  You
should be aware that if any unexpected data comes in, it may confuse
your server, but while this is possible, it is not likely.       

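In effect, it allows the address your socket was bound to to be reused. If another process tries to bind to that address while the socket is no longer listening, it will be allowed to do so.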
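
For reference, the same option can be set on a plain socket. This is a minimal sketch in Python 3 that only shows where the call goes and how to read the option back; it does not reproduce the TIME_WAIT situation itself, and port 0 (any free port) is an assumption for illustration.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# The option has to be set before bind() to affect that bind.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
sock.listen(1)

# getsockopt() returns a non-zero value when the option is enabled;
# the exact value differs between platforms.
reuse_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0
sock.close()
```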

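The reason you need to enable it is that you are not closing your TCPServer properly. To close it properly, you must call the shutdown method, which stops the serve_forever loop running in your thread, and then close the socket by calling the server_close method.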

def shutdown():
    global httpd
    global please_die
    print "Shutting down"

    try:
        please_die.wait() # how do you do? 
        httpd.shutdown() # Stop the serve_forever
        httpd.server_close() # Close also the socket.
    except Exception:
        traceback.print_exc(file=sys.stdout)
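
The whole start, shutdown, close sequence can be sketched in one place. This version assumes Python 3 (socketserver and http.server instead of SocketServer and SimpleHTTPServer); serve_while and its work callback are hypothetical names, and port 0 stands in for the question's fixed port.

```python
import http.server
import socketserver
import threading


def serve_while(work, port=0):
    """Run work(port) while an HTTP server is serving, then clean up."""
    handler = http.server.SimpleHTTPRequestHandler
    httpd = socketserver.TCPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.start()
    try:
        return work(httpd.server_address[1])
    finally:
        httpd.shutdown()      # ask serve_forever() to return
        thread.join()         # wait for the server thread to finish
        httpd.server_close()  # close the listening socket


# The callback here only reports the port; real code would make requests.
first_port = serve_while(lambda port: port)
# No client connected, so no connection is left in TIME_WAIT and the
# same port can be bound again immediately.
second_port = serve_while(lambda port: port, first_port)
```

Once clients have connected, as in the question, finished connections can still sit in TIME_WAIT after the socket is closed, which is why allow_reuse_address remains useful alongside server_close.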

The original question, "Python's SimpleHTTPServer started in a thread does not release its port", is on Stack Overflow: https://stackoverflow.com/questions/31745040/
