
How to serve AJAX pages (Ember.js, Angular, etc) to Google&a

Published: 2020-12-16 00:42:21 | Category: Encyclopedia | Source: compiled from the web

There may / must be better ways but here we go.

Recipes:

  1. Headless browser component, e.g.
    1. PhantomJS [1] is the preferred option because it is 1) WebKit-based and 2) lightweight compared with the next choice.
    2. Firefox + Xvfb. I had to use this one because my site breaks under PhantomJS (even though it works fine under Chrome).
  2. Selenium to drive the browser and generate the HTML.
  3. Web server that serves the bots.

As defined by Google [2], AJAX apps should use #! in their URLs to tell bots that a page is an AJAX page. A bot then requests the matching ?_escaped_fragment_= URL for that AJAX address and expects a JavaScript-free page. So something has to run the JavaScript and generate a proper DOM for the dummy bots. This is where headless browsers come in.
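
To make the URL convention concrete, here is a minimal sketch of how a server might map a bot's ?_escaped_fragment_= request back to the #! URL that a browser should load. The function name `to_hashbang_url` and the example.com URLs are made up for illustration and are not taken from [3]; only the standard library is used, and edge cases of Google's escaping rules are not handled.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

FRAGMENT_PARAM = "_escaped_fragment_"


def to_hashbang_url(crawler_url):
    """Map a bot's ?_escaped_fragment_= URL back to the matching #! URL.

    Returns None when the URL carries no _escaped_fragment_ parameter,
    i.e. when the request is not a crawler request of this kind.
    """
    parts = urlsplit(crawler_url)
    fragment = None
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == FRAGMENT_PARAM and fragment is None:
            fragment = value  # parse_qsl has already percent-decoded it
        else:
            kept.append((key, value))  # ordinary query parameters survive
    if fragment is None:
        return None
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "!" + fragment)
    )
```

A request without the parameter returns None, so the web server can pass it through untouched and only hand the mapped URLs to the headless browser.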

Xvfb is a special X server that runs (at least for me) on Linux and requires no interaction with graphics devices. It renders everything in memory, so it runs easily on headless servers such as Amazon EC2 Linux instances. Firefox is the de facto standard browser on Linux, works pretty well with Xvfb, and is the default driver for Selenium, so it's the obvious choice.
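
As a rough illustration of the Xvfb setup, the sketch below only builds the command line for a virtual display and the environment a browser would need to draw on it. The helper names, the display number 99, and the screen size are arbitrary assumptions; nothing is started here.

```python
import os


def xvfb_command(display=99, width=1280, height=1024, depth=24):
    """Build the argv for an Xvfb server on the given display number."""
    screen = "%dx%dx%d" % (width, height, depth)
    return ["Xvfb", ":%d" % display, "-screen", "0", screen]


def browser_environment(display=99, base_env=None):
    """Copy an environment and point DISPLAY at the virtual X server."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = ":%d" % display
    return env
```

One way to use these would be `subprocess.Popen(xvfb_command())` followed by launching Firefox with `env=browser_environment()`, so that the browser draws into memory instead of onto a real screen.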

Selenium was designed for browser-based test automation. It can drive different browsers: Firefox (with built-in support), plus Chrome and IE (both require an extra "driver"). In Python there is an API for Selenium itself, but there are also easier APIs such as Splinter, which is my choice.
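
A minimal sketch of the rendering step with Splinter might look like the following. The function `render_html` and its `browser_factory` parameter are hypothetical and exist only to keep the example testable without a real browser; this is not necessarily how [3] does it. Splinter is a third-party package and is imported only when no factory is supplied.

```python
def render_html(url, browser_factory=None):
    """Load url in a browser and return the DOM serialised as HTML.

    browser_factory is a hypothetical hook: any callable returning an object
    with visit(), html and quit(), which is the subset of Splinter's Browser
    used here.
    """
    if browser_factory is None:
        from splinter import Browser  # third-party; needs Firefox installed

        def browser_factory():
            return Browser("firefox")

    browser = browser_factory()
    try:
        browser.visit(url)
        return browser.html  # HTML of the page after its scripts have run
    finally:
        browser.quit()  # always release the browser, even on errors
```

Reading `browser.html` straight after `visit()` may be too early for pages that keep fetching data after load, so a real setup would probably wait for some element to appear first.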

If every URL is simply forwarded to Firefox, a page loads 20 - 100x slower than it does in a regular Firefox session, because for each resource (CSS, JavaScript, images) the server starts a new Firefox tab (if not a window) to retrieve it, even though the first AJAX page load would already have fetched them once. This is slow, so a hack is used here: all static resources are loaded via Requests instead. Better optimisations are available, though.
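
The static-resource shortcut could be sketched roughly like this. The extension list and both function names are assumptions made for illustration, not the actual rules used in [3]; Requests is a third-party package and is imported lazily so that the classification helper works without it.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed list of extensions that never need a browser to be served.
STATIC_EXTENSIONS = {
    ".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico", ".svg", ".woff",
}


def is_static_resource(url):
    """Return True when the URL path ends in a known static-file extension."""
    path = urlsplit(url).path  # ignores the query string and fragment
    return posixpath.splitext(path)[1].lower() in STATIC_EXTENSIONS


def fetch_static(url, timeout=10):
    """Fetch a static resource directly over HTTP, bypassing the browser."""
    import requests  # third-party

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    content_type = response.headers.get("Content-Type", "application/octet-stream")
    return response.content, content_type
```

The web server would call `is_static_resource` first and only fall back to the much slower browser rendering for URLs that are not plain files.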

So everything together is at [3].

Good luck.

[1] PhantomJS http://phantomjs.org
[2] Making AJAX applications crawlable https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?hl=de-DE
[3] AJAX for dummies https://github.com/wolf0403/ajax-for-dummies

(Editor: 李大同)
