seo - How to allow crawlers to access only index.php using robots.txt?

If I want crawlers to access only index.php, will this work?

User-agent: *
Disallow: /
Allow: /index.php

Best answer

Yes, it will work. Here is the test result from Google Webmaster Tools:

Url
http://www.example.org/index.php

Googlebot
Allowed by line 3: Allow: /index.php

Googlebot-Mobile
Allowed by line 3: Allow: /index.php

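Keep in mind, however, that with this configuration your site's home page will not be crawled unless it is requested by its full path. In other words, http://www.example.org/ is disallowed, while http://www.example.org/index.php is allowed.

If you want your home page to be accessible as well, here is a better version of the file: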

User-agent: *
Disallow: /
Allow: /index.php
Allow: /$
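
To see why the two files behave differently, here is a minimal sketch of the precedence rule that Google documents for robots.txt (and that RFC 9309 also describes): the longest matching pattern wins, and Allow wins a tie. The function names (parse_rules, pattern_to_regex, is_allowed) are made up for this illustration, and the sketch assumes a file with a single User-agent: * group. It is not Googlebot's actual implementation, and other crawlers may evaluate rules differently.

```python
import re

# The two files discussed above. This sketch ignores User-agent grouping
# and treats every Allow/Disallow line as belonging to one group.
ORIGINAL = "User-agent: *\nDisallow: /\nAllow: /index.php\n"
IMPROVED = ORIGINAL + "Allow: /$\n"


def parse_rules(robots_txt):
    """Collect (directive, pattern) pairs from the Allow/Disallow lines."""
    rules = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules


def pattern_to_regex(pattern):
    """'*' matches any run of characters; a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(robots_txt, path):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for field, pattern in parse_rules(robots_txt):
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), field == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


for name, rules in (("original", ORIGINAL), ("improved", IMPROVED)):
    for path in ("/", "/index.php", "/about.php"):
        print(name, path, is_allowed(rules, path))
```

Under these assumptions, /index.php is allowed in both files because Allow: /index.php is a longer match than Disallow: /. The bare / is blocked by the original file and allowed only once Allow: /$ is added, and /about.php stays blocked either way.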

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/1637620/
