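I have some code that checks links on a website, and in an attempt to "thread" it I have updated the code to use pcntl_fork().
The parent code works for both SSL and non-SSL URLs, but the child code only works for non-SSL URLs. I have noted in the code where it works and where it doesn't.
Here is my fork code. I know the code below will loop forever; I have removed the loop control code to make it more readable.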
    $this->initialize_curl();
    $this->connect_database();
    // prime the queue
    $this->add_url_to_queue($this->source_url, 0, 0);
    $this->process_next_url_in_queue($this->get_next_url_in_queue());
    // SSL and non-SSL work at this point
    // loop until we have processed all URLs
    while (1) {
        $url = $this->get_next_url_in_queue();
        // disconnect from the database before forking since we don't want to
        // share the database connection with child processes - the first one
        // will close it and ruin the fun for the other children.
        curl_close($this->ch);
        $this->db->close();
        // create child
        $pid = pcntl_fork();
        // handle forked processing
        switch ($pid) {
            // error
            case -1:
                print "Could not fork\n";
                exit;
            // child
            case 0:
                // separate database and curl for the child
                $this->connect_database();
                $this->initialize_curl();
                // process the url
                $this->process_next_url_in_queue($url);
                // only non-SSL works at this point
                exit;
            // parent
            default:
                // separate database and curl for the parent
                $this->connect_database();
                $this->initialize_curl();
                break;
        }
    }
As you can see, I have to close and reopen the database connection for this to work, and I am doing the same with CURL. Here is the code in initialize_curl():
$this->ch = curl_init();
curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, FALSE);
curl_setopt($this->ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($this->ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($this->ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($this->ch, CURLOPT_HEADER, FALSE);
I am setting CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST to FALSE because without that my SSL CURL requests fail. This is an issue with the server setup that I cannot change.
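When the child makes a CURL request to an SSL URL, I think it fails because of a problem with setting these options, but I don't know for sure. If I set CURL to verbose, I see the following error: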
* About to connect() to HOST port 443 (#0)
* Trying IP... * connected
* Connected to HOST (IP) port 443 (#0)
* NSS error -8023
* Closing connection #0
* SSL connect error
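For reference, verbose output like the above can be captured from PHP rather than read off stderr. The following is only a sketch: fetch_with_diagnostics() is a hypothetical helper name that is not part of the code in this question, and it writes the verbose log to an in-memory stream so it can be returned next to the CURL error number and message.

```php
<?php
// Sketch only: fetch_with_diagnostics() is a hypothetical helper, not part of
// the original link checker.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init($url);
    // In-memory stream that receives the verbose log instead of stderr.
    $log = fopen('php://temp', 'w+');

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, $log);

    $body = curl_exec($ch);

    $result = [
        'ok'    => $body !== false,
        'errno' => curl_errno($ch),   // 0 when the transfer succeeded
        'error' => curl_error($ch),   // human-readable message, '' on success
    ];

    // Read back everything CURL wrote to the verbose log.
    rewind($log);
    $result['verbose'] = stream_get_contents($log);
    fclose($log);

    return $result;
}
```

Calling this in both the parent and the forked child for the same SSL URL, then comparing the 'errno' and 'verbose' entries, would show whether the failure is specific to the child process.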
Please let me know what I can do to make this work.
Best answer
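After a lot of research, I found that this problem is not new and that it is an issue with PHP's CURL implementation. These other questions helped me come up with the solution I am sharing below:
What I ended up doing was using pcntl_exec, which replaces the current child process with the supplied command.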
    $this->initialize_curl();
    $this->connect_database();
    // prime the queue
    $this->add_url_to_queue($this->source_url, 0, 0);
    $this->process_next_url_in_queue($this->get_next_url_in_queue());
    // loop until we have processed all URLs
    while (1) {
        $url = $this->get_next_url_in_queue();
        // disconnect from the database before forking since we don't want to
        // share the database connection with child processes - the first one
        // will close it and ruin the fun for the other children.
        curl_close($this->ch);
        $this->db->close();
        // create child
        $pid = pcntl_fork();
        // handle forked processing
        switch ($pid) {
            // error
            case -1:
                print "Could not fork\n";
                exit;
            // child
            case 0:
                // separate database and curl for the child
                $this->connect_database();
                $this->initialize_curl();
                // process the url
                pcntl_exec('process_next_url_in_queue.php', array($url));
                exit;
            // parent
            default:
                // separate database and curl for the parent
                $this->connect_database();
                $this->initialize_curl();
                break;
        }
    }
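The fork-then-exec idea can also be sketched as a small standalone function. This is a sketch under stated assumptions, not the exact code used above: run_in_fresh_process() is a hypothetical name, and it assumes the CLI SAPI with the pcntl extension loaded. Because pcntl_exec() expects the path to an executable (or a script with a valid shebang line), the sketch runs the PHP binary itself via PHP_BINARY and passes the script path as the first argument. It also waits for the child so that it does not linger as a zombie.

```php
<?php
// Sketch only: run_in_fresh_process() is a hypothetical name.
// Assumes the PHP CLI with the pcntl extension available.

/**
 * Fork, replace the child with a new PHP interpreter running $script,
 * and return the child's exit code (or -1 on failure).
 */
function run_in_fresh_process(string $script, array $args): int
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        return -1; // could not fork
    }

    if ($pid === 0) {
        // Child: the exec replaces this process image entirely, so nothing
        // initialized before the fork is carried into the new interpreter.
        pcntl_exec(PHP_BINARY, array_merge([$script], $args));
        // Only reached if the exec itself failed.
        exit(127);
    }

    // Parent: block until the child finishes and report its exit code.
    pcntl_waitpid($pid, $status);
    return pcntl_wifexited($status) ? pcntl_wexitstatus($status) : -1;
}
```

With this shape, the parent would call something like run_in_fresh_process('/path/to/process_next_url_in_queue.php', array($url)), and the worker script would read the URL from $argv[1] and set up its own database connection and CURL handle. Blocking on pcntl_waitpid() after every fork makes the work sequential; a real link checker would more likely start several children and reap them as they finish.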
Regarding php - curl and pcntl_fork(), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34901910/