nginx - php5-fpm + nginx + Google bot = Connection reset by peer

Tags: nginx php googlebot

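I have spent a crazy amount of time trying to figure out why, for the past few hours, my logs have been showing slow PHP script warnings several times a minute.

At first I focused on the PHP slow log and the PHP error log, panicking that it was my code. I happened to be implementing some DNS tweaks at the time, which is why I was misled.

I eventually checked the nginx error log, which shows line after line of "Connection reset by peer" from clients at nearly identical IPs.

I googled the IPs and found that they belong to Google, so this is evidently the Google bot/spider visiting the site.

Here is a snippet of the error log: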

2013/06/06 14:04:05 [error] 12313#0: *7435269 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.187, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:04:05 [error] 12308#0: *7435135 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.167, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:04:05 [error] 12308#0: *7435994 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.199, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:04:12 [error] 12309#0: *7436209 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.168, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:05:12 [error] 12309#0: *7441608 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.177, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:05:15 [error] 12310#0: *7440634 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.219, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:05:15 [error] 12313#0: *7441634 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.194, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:06:02 [error] 12310#0: *7444721 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.221, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:06:05 [error] 12308#0: *7443911 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.203, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:06:05 [error] 12309#0: *7445423 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.164, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"
2013/06/06 14:06:05 [error] 12310#0: *7445640 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 209.85.238.222, server: www.domain.com, request: "GET /c.html?q=xyz HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.domain.com"

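What is causing the connection to be reset by peer? Does the Google bot really hit a page and then abort the request, just to check that the page exists?

That would not be great, because each hit triggers my curl requests, which are then left as orphaned threads once the client has gone away. That would mean they simply time out, which is what makes the PHP scripts slow.

Or am I reading this wrong?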

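Best answer

If you look at the error message, it says

while reading response header from upstream

This means the problem is not that Google is terminating the request, but that nginx's upstream, which here is php-fpm, is terminating it. Usually this is caused by an error in the PHP code being run.

Since we don't have the code, here are some general troubleshooting steps: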

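  • In your php-fpm configuration, increase the values of request_terminate_timeout, max_input_time, and max_execution_time.
  • Enable error logging in php.ini or in the pool.conf configuration file (but not display_errors if this is a production site).
  • Try running a debugger on the code in question (xdebug is very useful) and step through it; you will stumble across most problems that way.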
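
As a rough illustration of the first two steps, a php-fpm pool file might look something like the fragment below. This is only a sketch: the pool name, the file paths, and every numeric value are assumptions to adapt to your own setup, not recommended settings. request_terminate_timeout is a php-fpm pool directive, while max_execution_time and max_input_time are ordinary php.ini settings that are overridden per pool here.

```ini
; Hypothetical pool file, e.g. /etc/php5/fpm/pool.d/www.conf (path is an assumption)
[www]

; php-fpm kills a worker that spends longer than this on one request.
; 0 (the default) disables the limit. The value is illustrative.
request_terminate_timeout = 120s

; php.ini limits overridden for this pool. Values are illustrative.
php_admin_value[max_execution_time] = 120
php_admin_value[max_input_time] = 120

; Log PHP errors to a file, but do not show them to visitors.
php_admin_flag[log_errors] = on
php_admin_flag[display_errors] = off
; Assumed log location; the php-fpm user must be able to write to it.
php_admin_value[error_log] = /var/log/php5-fpm.www.error.log
```

Reload php-fpm after editing the file, then watch the configured error log while the failing URL is requested again.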

This question and answer come from Stack Overflow: https://stackoverflow.com/questions/16969903/
