python - Terminate multiple threads when any thread completes a task

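I am new to both Python and threads. I have written Python code that acts as a web crawler and searches sites for a specific keyword. My question is: how can I use threads to run three different instances of my class at the same time? When one of the instances finds the keyword, all three must close and stop crawling the web. Here is some code.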

class Crawler:
    def __init__(self):
        # the actual code for finding the keyword
        pass

def main():
    crawler = Crawler()

if __name__ == "__main__":
    main()

How can I use threads to have Crawler do three different crawls at the same time?

Best answer

There doesn't seem to be a (simple) way to terminate a thread in Python.
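Since a thread cannot be killed from outside, a common workaround is to have the workers stop themselves: they share a `threading.Event`, check it between units of work, and the first one to find the keyword sets it. The following is only a minimal sketch of that idea, written for Python 3. The names `run_crawlers` and `page_sets` and the example.com URLs are made up for illustration, and the "pages" are in-memory strings rather than real HTTP responses.

```python
import threading

def crawl(pages, keyword, stop_event, found):
    """Scan (url, text) pairs until the keyword appears or another worker signals stop."""
    for url, text in pages:
        if stop_event.is_set():   # another worker already found the keyword
            return
        if keyword in text:
            found.append(url)     # list.append is thread-safe in CPython
            stop_event.set()      # ask every other worker to stop
            return

def run_crawlers(page_sets, keyword):
    """Start one thread per page set and return the URLs where the keyword was found."""
    stop_event = threading.Event()
    found = []
    threads = [
        threading.Thread(target=crawl, args=(pages, keyword, stop_event, found))
        for pages in page_sets
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()             # workers exit at their next check of the event
    return found

# Hypothetical in-memory "sites" so the sketch runs without network access.
page_sets = [
    [("http://example.com/a1", "nothing here"), ("http://example.com/a2", "still nothing")],
    [("http://example.com/b1", "filler"), ("http://example.com/b2", "the keyword is here")],
    [("http://example.com/c1", "more filler"), ("http://example.com/c2", "and more")],
]
print(run_crawlers(page_sets, "keyword"))
```

The catch is that a worker only notices the event between pages, so a request that is already in flight still runs to completion before that thread exits.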

Here is a simple example of running multiple HTTP requests in parallel:

import threading
import urllib.request

def crawl():
    # fetch the page; the body is read and then discarded
    data = urllib.request.urlopen("http://www.google.com/").read()

    print("Read google.com")

threads = []

for n in range(10):  # start 10 threads, each making one request
    thread = threading.Thread(target=crawl)
    thread.start()

    threads.append(thread)

# wait until all of the threads have finished

print("Waiting...")

for thread in threads:
    thread.join()

print("Complete.")
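The `join()` loop above waits for every thread. If you only care about whichever worker finishes first, the standard library's `concurrent.futures.wait` with `return_when=FIRST_COMPLETED` may be a better fit. This is a rough Python 3 sketch, not a drop-in crawler: `fake_crawl` and `first_finished` are hypothetical names, and the crawl is simulated with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def fake_crawl(name, delay):
    """Hypothetical stand-in for a crawl: sleep for `delay` seconds, then return a label."""
    time.sleep(delay)
    return name

def first_finished(jobs):
    """Run every (name, delay) job in its own thread and return the first result."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [pool.submit(fake_crawl, name, delay) for name, delay in jobs]
        done, not_done = wait(futures, return_when=FIRST_COMPLETED)
        for future in not_done:
            future.cancel()   # only affects jobs that have not started running
        result = next(iter(done)).result()
    # leaving the `with` block still waits for the jobs that were already running
    return result

print(first_finished([("slow", 0.5), ("fast", 0.05), ("medium", 0.25)]))
```

Note that `cancel()` cannot interrupt a job that is already running, so this gives you the first answer early but does not actually stop the other threads.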

At the cost of some extra overhead, you can use the more powerful multiprocessing approach, which lets you terminate thread-like processes.

I've expanded the example to use that. I hope it helps:

import multiprocessing
import urllib.request

def crawl(result_queue):
    data = urllib.request.urlopen("http://news.ycombinator.com/").read()

    print("Requested...")

    # placeholder condition (a non-empty string is always true);
    # replace it with a real check on `data`
    if "result found (for example)":
        result_queue.put("result!")

    print("Read site.")

if __name__ == "__main__":  # needed where child processes are spawned (e.g. Windows)
    processes = []
    result_queue = multiprocessing.Queue()

    for n in range(4):  # start 4 processes crawling for the result
        process = multiprocessing.Process(target=crawl, args=[result_queue])
        process.start()
        processes.append(process)

    print("Waiting for result...")

    result = result_queue.get()  # blocks until any of the processes has `.put()` a result

    for process in processes:  # then kill them all off
        process.terminate()

    print("Got result:", result)
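One thing to watch for: if no process ever puts a result, `result_queue.get()` blocks forever. `multiprocessing.Queue.get` accepts a `timeout` argument and raises `queue.Empty` when it expires, the same as the thread-based `queue.Queue`. The sketch below uses `queue.Queue` so it runs without starting any processes; `wait_for_result` is a hypothetical helper name, and the call shape should carry over to a `multiprocessing.Queue`.

```python
import queue

def wait_for_result(result_queue, timeout_seconds):
    """Return the first queued result, or None if nothing arrives in time."""
    try:
        return result_queue.get(timeout=timeout_seconds)
    except queue.Empty:
        return None

results = queue.Queue()
print(wait_for_result(results, 0.1))   # nothing has been queued yet
results.put("result!")
print(wait_for_result(results, 0.1))   # a result is now available
```

Either way you would still terminate the workers afterwards; the timeout only keeps the parent from hanging when nothing is found.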

Regarding "python - Terminate multiple threads when any thread completes a task", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6286235/
