regex - Error getting .htaccess to direct googlebot using _escaped_fragment_

Reposted by: 塔克拉玛干 · Updated: 2023-11-03 02:44:38


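I'm trying to use a prerender service for my Backbone application so that my pages get indexed by Google.

I know the setup works when I add googlebot specifically to the user-agent list, but I was advised not to do that and to use the _escaped_fragment_ method instead. The only problem is that the _escaped_fragment_ parameter isn't being passed through correctly. Can anyone help?

Thanks!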

    # html5 pushstate (history) support:
    <ifModule mod_rewrite.c>

        RewriteEngine On

        RewriteCond %{HTTP_HOST} ^example\.com$ [OR]
        RewriteCond %{HTTPS} !on
        RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]

        # If requested resource exists as a file or directory
        # (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
        RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
        # Go to it as is
        RewriteRule ^ - [L]

        # If non existent
        # If path ends with / and is not just a single /, redirect to without the trailing /
        RewriteCond %{REQUEST_URI} ^.*/$
        RewriteCond %{REQUEST_URI} !^/$
        RewriteRule ^(.*)/$ $1 [R,QSA,L]

        # Handle Prerender.io
        RequestHeader set X-Prerender-Token "xxxxxxxx"

        RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
        RewriteCond %{QUERY_STRING} _escaped_fragment_

        # Proxy the request
        RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.doc|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent|\.ttf|\.woff))(.*) http://service.prerender.io/https://www.example.com/$2 [P,L]

        # If non existent
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteCond %{REQUEST_URI} !index
        RewriteRule (.*) index.html [L,QSA]

    </ifModule>

All the required Apache modules are loaded and working.
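
As a way to reason about which requests the Prerender.io block would proxy, its two conditions and the extension filter can be mirrored in a short Python sketch. This is only an approximation for illustration, not how mod_rewrite is implemented: the function name `would_proxy_to_prerender` and the constants are hypothetical, and the earlier rules (HTTPS redirect, existing-file passthrough, trailing-slash redirect) are not modelled.

```python
import re

# User-agent alternatives copied from the RewriteCond; the [NC] flag
# corresponds to re.IGNORECASE. Note that googlebot is not in this list.
BOT_UA = re.compile(
    r"baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly"
    r"|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare"
    r"|W3C_Validator",
    re.IGNORECASE,
)

# Extensions excluded by the negative lookahead in the RewriteRule. As in the
# original pattern, the match is case-sensitive and may occur anywhere in the path.
STATIC_EXT = re.compile(
    r"\.(?:js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar"
    r"|exe|wmv|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg"
    r"|iso|flv|m4v|torrent|ttf|woff)"
)


def would_proxy_to_prerender(path: str, query_string: str, user_agent: str) -> bool:
    """Return True if the Prerender.io rule would plausibly proxy this request."""
    # Condition 1 (bot user agent) OR condition 2 (_escaped_fragment_ in the query).
    wants_snapshot = (
        BOT_UA.search(user_agent) is not None
        or "_escaped_fragment_" in query_string
    )
    # The RewriteRule pattern itself rejects paths that look like static assets.
    is_static_asset = STATIC_EXT.search(path) is not None
    return wants_snapshot and not is_static_asset
```

Under these assumptions, a Googlebot request is proxied only when the query string carries `_escaped_fragment_`, because googlebot is deliberately absent from the user-agent list.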

Best answer

So the .htaccess is actually correct... here is the official answer from Google.

Quoted from http://productforums.google.com/forum/#!category-topic/webmasters/crawling-indexing--ranking/bZgWCJTnl08%5B1-25%5D, by John Mueller (Google employee):

Looking at your blog's homepage, one thing to keep in mind is that the Fetch
as Googlebot feature does not parse the content that it fetches. So when you
submit toddmoyer.net/blog/ , it fetches that URL. After fetching the URL, it
doesn't parse it to check for the "fragment" meta tag, it just returns it to
you. However, if you fetch toddmoyer.net/blog/#! , then it should rewrite the
URL and fetch the URL toddmoyer.net/blog/?_escaped_fragment_= .

When we crawl and index your pages, we'll notice the meta-tag and act
accordingly. It's just the Fetch as Googlebot feature that doesn't check for
meta-tags, and instead just returns the raw content.
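
The URL rewriting described in the quote can be sketched in Python. This is a minimal illustration of the mapping in Google's (since deprecated) AJAX crawling scheme, not Google's own code: the function name `to_escaped_fragment_url` and the `has_fragment_meta` flag are hypothetical, and `urllib.parse.quote` only approximates the scheme's escaping rules (it escapes more characters than strictly required).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_escaped_fragment_url(url: str, has_fragment_meta: bool = False) -> str:
    """Map a "pretty" URL to the URL a crawler following the scheme would fetch.

    has_fragment_meta stands for a page that declares
    <meta name="fragment" content="!"> and has no hash fragment of its own.
    """
    parts = urlsplit(url)
    if parts.fragment.startswith("!"):
        # "#!state" becomes "_escaped_fragment_=state", with "&", "#", "%",
        # "+" and spaces percent-encoded.
        state = quote(parts.fragment[1:], safe="/=")
    elif has_fragment_meta and not parts.fragment:
        # Meta-tag case: the parameter is appended with an empty value.
        state = ""
    else:
        # Not an AJAX-crawlable URL under the scheme: leave it untouched.
        return url

    param = "_escaped_fragment_=" + state
    query = parts.query + "&" + param if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_escaped_fragment_url("http://toddmoyer.net/blog/#!"))
# http://toddmoyer.net/blog/?_escaped_fragment_=
```

A plain fetch of `http://toddmoyer.net/blog/` is returned unchanged unless `has_fragment_meta=True` is passed, which mirrors the difference between Fetch as Googlebot and the real crawler described above.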

Regarding "regex - Error getting .htaccess to direct googlebot using _escaped_fragment_", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28420212/
