search-engine - Googlebot finds robots.txt but cannot download it

Can anyone tell me what is wrong with this robots.txt?

http://bizup.cloudapp.net/robots.txt

This is the error I get in Google Webmaster Tools:

Sitemap errors and warnings
Line    Status  Details
Errors  -   
Network unreachable: robots.txt unreachable
We were unable to crawl your Sitemap because we found a robots.txt file at the root of
your site but were unable to download it. Please ensure that it is accessible or remove
it completely.

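The link above is actually a route mapped to an action named Robots. That action reads the file from storage and returns its content as text/plain. Google says it cannot download the file. Could that be the reason?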

Best answer

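It looks like Google is reading robots.txt fine, but your robots.txt then claims that http://bizup.cloudapp.net/robots.txt is also the URL of your XML sitemap, when it is really http://bizup.cloudapp.net/sitemap.xml. The error seems to come from Google trying to parse robots.txt as an XML sitemap. You need to change your robots.txt to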

User-agent: *
Allow: /
Sitemap: http://bizup.cloudapp.net/sitemap.xml
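Before resubmitting, the corrected file can be checked locally. The following is only a sketch using Python's standard urllib.robotparser (site_maps() needs Python 3.8 or later); the helper names parse_robots and sitemaps_pointing_at_robots are made up for illustration and are not part of any tool mentioned here.

```python
from urllib.robotparser import RobotFileParser

# The corrected robots.txt from the answer above.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Sitemap: http://bizup.cloudapp.net/sitemap.xml
"""


def parse_robots(text):
    """Parse robots.txt content from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def sitemaps_pointing_at_robots(parser):
    """Return Sitemap entries that wrongly name a robots.txt file."""
    sitemaps = parser.site_maps() or []  # site_maps() is None when no entry exists
    return [url for url in sitemaps if url.lower().endswith("/robots.txt")]


parser = parse_robots(ROBOTS_TXT)
print(parser.site_maps())                   # Sitemap URLs declared in the file
print(sitemaps_pointing_at_robots(parser))  # an empty list means no bad entry
print(parser.can_fetch("Googlebot", "http://bizup.cloudapp.net/"))
```

Running the same check on the original file, where the Sitemap line named robots.txt itself, would list that URL as a bad entry.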

Edit

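It actually goes a bit deeper than that: Googlebot cannot download any page on your site at all. Here is the exception returned when Googlebot requests either robots.txt or the home page: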

Cookieless Forms Authentication is not supported for this application.

Exception Details: System.Web.HttpException: Cookieless Forms Authentication is not supported for this application.

[HttpException (0x80004005): Cookieless Forms Authentication is not supported for this application.]
AzureBright.MvcApplication.FormsAuthentication_OnAuthenticate(Object sender, FormsAuthenticationEventArgs args) in C:\Projectos\AzureBrightWebRole\Global.asax.cs:129
System.Web.Security.FormsAuthenticationModule.OnAuthenticate(FormsAuthenticationEventArgs e) +11336832
System.Web.Security.FormsAuthenticationModule.OnEnter(Object source, EventArgs eventArgs) +88
System.Web.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() +80
System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously) +266

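FormsAuthentication is trying to use cookieless mode because it recognises that Googlebot does not support cookies, but something in your FormsAuthentication_OnAuthenticate method then throws an exception because it does not want to accept cookieless authentication.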
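One way to reproduce this without waiting for Google to recrawl is to request the page yourself with a Googlebot-style User-Agent and no cookies. This is a rough sketch with Python's standard urllib, which sends no cookies by default. It assumes the server picks cookieless mode from the User-Agent string alone; the function names and the exact User-Agent value are illustrative assumptions.

```python
import urllib.error
import urllib.request

# Assumed Googlebot-style User-Agent string, used only for this illustration.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_crawler_request(url):
    """Build a GET request that identifies itself as Googlebot and carries no cookies."""
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends for a crawler-style request."""
    request = build_crawler_request(url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; report the code instead.
        return err.code
```

Calling fetch_status("http://bizup.cloudapp.net/robots.txt") before and after the fix should show whether the server still answers crawler-style requests with an error status; a connection failure would surface as urllib.error.URLError.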

I think the easiest fix is to change the following in web.config, which stops FormsAuthentication from ever trying to use cookieless mode...

<authentication mode="Forms"> 
    <forms cookieless="UseCookies" ...>
    ...

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3524031/
