regex - Trouble getting .htaccess to route Googlebot using _escaped_fragment_

Tags: regex apache .htaccess seo

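I'm trying to use a prerender service for my Backbone application so that my pages get indexed by Google.

I know the setup works when I explicitly add googlebot to the user-agent list, but I was advised not to do that and to use the _escaped_fragment_ approach instead. The only problem is that the _escaped_fragment_ parameter is not being passed through correctly. Can anyone help?

Thanks!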

    # html5 pushstate (history) support:
    <ifModule mod_rewrite.c>

        RewriteEngine On

        RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
        RewriteCond %{HTTPS} !on
        RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]

        # If requested resource exists as a file or directory
        # (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
        # Go to it as is
        RewriteRule ^ - [L]

        # If non existent
        # If path ends with / and is not just a single /, redirect to without the trailing /
        RewriteCond %{REQUEST_URI} ^.*/$
        RewriteCond %{REQUEST_URI} !^/$
        RewriteRule ^(.*)/$ $1 [R,QSA,L]

        # Handle Prerender.io
        RequestHeader set X-Prerender-Token "xxxxxxxx"

        RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
        RewriteCond %{QUERY_STRING} _escaped_fragment_

        # Proxy the request
        RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/https://www.example.com/$2 [P,L]

        # If non existent
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteCond %{REQUEST_URI} !index
        RewriteRule (.*) index.html [L,QSA]

    </ifModule>

All the required Apache modules are loaded and working.
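To make the intent of the Prerender.io rules easier to check, here is a rough Python model of the decision they encode: proxy the request when the user agent matches the bot list or the query string contains `_escaped_fragment_`, unless the path contains one of the static-file extensions. This is only a sketch under assumptions; the function name `should_prerender` is hypothetical, the earlier rules (HTTPS redirect, existing-file passthrough, trailing-slash redirect) are ignored, and Apache's actual matching may differ in edge cases.

```python
import re

# Bot list copied from the RewriteCond; the [NC] flag maps to re.IGNORECASE.
BOT_AGENTS = re.compile(
    r"baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|"
    r"quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator",
    re.IGNORECASE,
)

# Extensions from the negative lookahead in the RewriteRule (duplicates dropped).
STATIC_EXTENSIONS = (
    "js css xml less png jpg jpeg gif pdf doc txt ico rss zip mp3 rar exe wmv "
    "avi ppt mpg mpeg tif wav mov psd ai xls mp4 m4a swf dat dmg iso flv m4v "
    "torrent ttf woff"
).split()

# Like the original lookahead, this is NOT anchored to the end of the path,
# so ".js" also matches inside ".json".
STATIC_PATTERN = re.compile(r"\.(?:" + "|".join(STATIC_EXTENSIONS) + ")")


def should_prerender(user_agent: str, path: str, query_string: str) -> bool:
    """Hypothetical helper: approximate whether the rules would proxy a request."""
    # The second RewriteCond has no [NC], so it is a case-sensitive substring match.
    wants_snapshot = bool(BOT_AGENTS.search(user_agent)) or "_escaped_fragment_" in query_string
    looks_static = bool(STATIC_PATTERN.search(path))
    return wants_snapshot and not looks_static


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(should_prerender(googlebot, "products/42", ""))                     # False
print(should_prerender(googlebot, "products/42", "_escaped_fragment_="))  # True
```

Under this model, Googlebot is only proxied when it requests the `_escaped_fragment_` form of the URL, which matches the behaviour described in the question.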

Best answer

So the .htaccess was actually correct. Here is the official answer from Google.

Quoted from http://productforums.google.com/forum/#!category-topic/webmasters/crawling-indexing--ranking/bZgWCJTnl08%5B1-25%5D by John Mueller (Google employee):

Looking at your blog's homepage, one thing to keep in mind is that the Fetch
as Googlebot feature does not parse the content that it fetches. So when you
submit toddmoyer.net/blog/ , it fetches that URL. After fetching the URL, it
doesn't parse it to check for the "fragment" meta tag, it just returns it to
you. However, if you fetch toddmoyer.net/blog/#! , then it should rewrite the
URL and fetch the URL toddmoyer.net/blog/?_escaped_fragment_= .

When we crawl and index your pages, we'll notice the meta-tag and act
accordingly. It's just the Fetch as Googlebot feature that doesn't check for
meta-tags, and instead just returns the raw content.
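The URL rewrite described in the quote can be sketched in Python as follows. This is a minimal illustration, assuming the mapping from Google's AJAX crawling scheme (the part after `#!` moves into an `_escaped_fragment_` query parameter); the function name `to_escaped_fragment_url` is hypothetical, and the escaping here is conservative, so it may percent-encode more characters than a real crawler would.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url: str) -> str:
    """Hypothetical helper: map a #! URL to its _escaped_fragment_ form."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        # Not a hash-bang URL, so there is nothing to rewrite.
        return url

    # Drop the leading "!" and percent-encode characters such as "&", "#",
    # "%" and "+" so they cannot be confused with query-string syntax.
    escaped = quote(parts.fragment[1:], safe="=/:,;!$'()*?@")
    param = "_escaped_fragment_=" + escaped

    # Append to an existing query string if there is one.
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment_url("http://toddmoyer.net/blog/#!"))
# http://toddmoyer.net/blog/?_escaped_fragment_=
```

Requesting the rewritten URL directly (for example in a browser) is one way to check whether the server returns the prerendered snapshot for `_escaped_fragment_` requests.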

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/28420212/