Documentation

Url Structure

Javascript driven websites can have one of three different types of url structure.

Hash

www.yoursite.com/#category/page1

This was the first url structure used by AJAX driven websites. It is still the structure that most javascript frameworks default to today.

However, hash urls cannot be used for SEO (through BromBone or through any other method). The part of the url after the hash isn't sent to the server. So, there is no way for the server to know what route is being requested and send the proper snapshot in response.
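As a rough illustration of why the server never sees the hash, the sketch below uses the standard URL class to split a url into the part that goes into the HTTP request and the part that stays in the browser. The function name requestTarget is ours, chosen only for this example.

```javascript
// Return the part of a url that a browser actually sends to the server.
// The fragment (everything from "#" onward) stays in the browser.
function requestTarget(fullUrl) {
  const url = new URL(fullUrl);
  // pathname + search goes into the request line; url.hash is never sent.
  return url.pathname + url.search;
}

// Hash url: the route is lost, the server only sees "/".
console.log(requestTarget("http://www.yoursite.com/#category/page1"));
// pushState-style url: the full route reaches the server.
console.log(requestTarget("http://www.yoursite.com/category/page1"));
```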

If you are using hash urls, you will need to change to one of the two url structures outlined below. Thankfully, most javascript frameworks make this easy; it is normally just a one line change.

Hashbang

www.yoursite.com/#!category/page1

The part of the url after the hash isn't sent to the server. Hashbang urls provide a workaround for this problem.

The hashbang (the #!) is a signal to bots that the server needs the information after the #!. The bots accomplish this by putting the information after the #! into a query parameter called _escaped_fragment_.

When a bot sees a page with this url:

www.yoursite.com/#!category/page1

It sees the #! and requests this url instead:

www.yoursite.com/?_escaped_fragment_=category/page1

Now all the information is getting to your webserver and you can respond with the appropriate snapshot.
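The rewrite a bot performs can be sketched in a few lines. This is only an approximation of the bot's behavior, matching the example above; real crawlers may also percent-escape some special characters in the fragment, which this sketch does not attempt. The function name toEscapedFragmentUrl is hypothetical.

```javascript
// Convert a hashbang url into the url a bot would request instead.
function toEscapedFragmentUrl(hashbangUrl) {
  const index = hashbangUrl.indexOf("#!");
  if (index === -1) return hashbangUrl; // not a hashbang url, nothing to do

  const base = hashbangUrl.slice(0, index);      // everything before the #!
  const fragment = hashbangUrl.slice(index + 2); // everything after the #!
  // Append to an existing query string if there is one.
  const separator = base.includes("?") ? "&" : "?";
  return base + separator + "_escaped_fragment_=" + fragment;
}

console.log(toEscapedFragmentUrl("http://www.yoursite.com/#!category/page1"));
```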

HTML5 pushState

www.yoursite.com/category/page1

Hash urls and Hashbang urls were workarounds. HTML5 pushState urls are the new standard. PushState is a new HTML5 feature that allows javascript to modify the url without causing a page refresh.

If you are using pushState urls, then you don't have to worry about the hash part of the url not getting sent to the server.

We recommend pushState urls for all new development. However, it is important to note that IE9 and older versions of IE don't support pushState. We recommend falling back to full page reloads for those browsers. However, most frameworks also support falling back to hash urls for those browsers if you prefer.
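If you are not using a framework, the pushState call with a full page reload fallback might look roughly like this. It is a minimal sketch: navigate is a hypothetical helper, and win is assumed to be the browser's window object (passed in so the logic can be exercised outside a browser).

```javascript
// Change the url without a page refresh where pushState exists,
// otherwise fall back to a normal full page load.
function navigate(win, path) {
  if (win.history && typeof win.history.pushState === "function") {
    win.history.pushState({}, "", path); // url changes, page is not reloaded
    // Your own code would render the new route here.
    return "pushState";
  }
  win.location.assign(path); // older browsers: let the server render the page
  return "reload";
}

// In a browser you would call: navigate(window, "/category/page1");
```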

Sitemap XML

The sitemap format

The easiest way to tell BromBone what urls to generate html snapshots for is by using a sitemap xml following the sitemaps spec. This is the same file that Google and other search engines can use to find all your pages.

Your sitemap will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2014-03-04T23:20:28Z</lastmod>
        <changefreq>daily</changefreq>
    </url>
    <url>
        <loc>http://www.example.com/page1</loc>
        <lastmod>2014-03-04T23:20:28Z</lastmod>
        <changefreq>monthly</changefreq>
    </url>
</urlset>

A sitemap is limited to 50,000 urls. If you have more than that, you will need to create multiple sitemap files and list them in a sitemap index file that will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
        <loc>http://www.example.com/sitemap1.xml</loc>
    </sitemap>
    <sitemap>
        <loc>http://www.example.com/sitemap2.xml</loc>
    </sitemap>
</sitemapindex>

BromBone will only process urls that are explicitly listed in the sitemaps that you submit (or in their child sitemaps).
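If you generate your sitemaps from code, the 50,000 url limit can be handled by splitting your url list into chunks and writing one sitemap file per chunk. The sketch below shows one possible way to do that; the helper names (escapeXml, chunkUrls, buildSitemap) are ours, and the optional lastmod and changefreq tags are left out to keep it short.

```javascript
const MAX_URLS_PER_SITEMAP = 50000;

// The sitemaps spec requires these characters to be entity escaped.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Split a long list of urls into groups that each fit in one sitemap file.
function chunkUrls(urls, size = MAX_URLS_PER_SITEMAP) {
  const chunks = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

// Build the xml for a single sitemap file.
function buildSitemap(urls) {
  const entries = urls.map(
    (u) => "    <url>\n        <loc>" + escapeXml(u) + "</loc>\n    </url>"
  );
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries.join("\n") +
    "\n</urlset>"
  );
}

// One xml string per sitemap file.
const files = chunkUrls([
  "http://www.example.com/",
  "http://www.example.com/page1",
]).map(buildSitemap);
console.log(files[0]);
```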

Encoding

Sitemap Monitoring

BromBone automatically monitors your sitemaps, generates new snapshots for new pages, and keeps snapshots fresh for old pages.

Every six hours we open your sitemaps and look at all the urls listed in them.

All plans work under the assumption that, on average, your site won't request more than 20% of your total pages to be refreshed per day.

If you need faster refreshes or guaranteed refresh rates, talk to us about Enterprise Plans.

Sitemap monitoring can be customized for Enterprise Plans. However, many Enterprise plans only generate snapshots for new files and for files whose lastmod tag has been updated. When updating the lastmod tag is possible, it is much more efficient and gives you more control.

REST API

Our REST API is available with Enterprise plans.

Render

POST https://api.brombone.com/snapshot

Body Parameters (json):
site_name: string, assigned to you by BromBone
api_key: string, assigned to you by BromBone
url: string, full url to create the snapshot from
callback_url: url to call when the snapshot is generated or when an error occurs

Callback Body Parameters (json):
request_url: The url that you requested we create the snapshot from
snapshot_url: The url of the generated snapshot (only if successful)
error: A description of the error (only if an error occurred)
time_to_complete: Time in seconds to generate the snapshot

Example:

curl -X POST -H "Content-Type: application/json" \
  -d '{"site_name":"example_site","api_key":"jdkiplkmdhvnqsue","url":"https://www.yoursite.com/page","callback_url":"https://www.yoursite.com/brombone_callback"}' \
  https://api.brombone.com/snapshot

How quickly your snapshots are available will depend on your needs and your plan. BromBone can have snapshots ready in under 10 seconds if needed.
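From javascript, the same request and the callback handling might be sketched as follows. The parameter names in the JSON bodies come from the lists above; the helper names (buildSnapshotRequest, summarizeCallback) and the wording of the summary strings are our own assumptions, not part of the API.

```javascript
// Build the options for the POST to https://api.brombone.com/snapshot.
function buildSnapshotRequest({ siteName, apiKey, url, callbackUrl }) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      site_name: siteName,
      api_key: apiKey,
      url: url,
      callback_url: callbackUrl,
    }),
  };
}

// Turn the JSON body sent to your callback_url into a log line.
// `error` is only present on failure, `snapshot_url` only on success.
function summarizeCallback(body) {
  if (body.error) {
    return `failed: ${body.request_url} (${body.error})`;
  }
  return `ready: ${body.request_url} -> ${body.snapshot_url} in ${body.time_to_complete}s`;
}

// Sending the request (not run here), where fetch is available:
// fetch("https://api.brombone.com/snapshot", buildSnapshotRequest({ ... }));
```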

Sitemap Management

GET https://api.brombone.com/sitemaps/SITE_NAME?api_key=YOUR_API_KEY

URL parameters:
SITE_NAME: string, assigned to you by BromBone

Query parameters:
api_key: string, assigned to you by BromBone

Example

curl -X GET 'https://api.brombone.com/sitemaps/example_site?api_key=jdkiplkmdhvnqsue'

PUT https://api.brombone.com/sitemaps/SITE_NAME/SITEMAP_URL

URL parameters:
SITE_NAME: string, assigned to you by BromBone
SITEMAP_URL: string, full url of the sitemap to add including the protocol

Body parameters (json):
api_key: string, assigned to you by BromBone

Example

curl -X PUT -H "Content-Type: application/json" \
  -d '{"api_key":"jdkiplkmdhvnqsue"}' \
  'https://api.brombone.com/sitemaps/example_site/mysitemap.xml'

DELETE https://api.brombone.com/sitemaps/SITE_NAME/SITEMAP_URL

URL parameters:
SITE_NAME: string, assigned to you by BromBone
SITEMAP_URL: string, full url of the sitemap to delete

Body parameters (json):
api_key: string, assigned to you by BromBone

Example

curl -X DELETE -H "Content-Type: application/json" \
  -d '{"api_key":"jdkiplkmdhvnqsue"}' \
  'https://api.brombone.com/sitemaps/example_site/mysitemap.xml'

Snapshot Creation

The HTML snapshot creation is handled completely by BromBone. However, there are a few things you might want to know to customize your pages.

BromBone Header

All requests from BromBone contain this HTTP header:

'X-Crawl-Request': 'brombone'
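On your server you can check for this header to tell BromBone's requests apart from other traffic. A minimal sketch, assuming the request headers are available as a plain object (as they are in nodejs); isBromBoneRequest is a hypothetical helper name. HTTP header names are case-insensitive, so the lookup ignores case.

```javascript
// Return true when the request headers contain X-Crawl-Request: brombone.
function isBromBoneRequest(headers) {
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === "x-crawl-request") {
      return value === "brombone";
    }
  }
  return false;
}

console.log(isBromBoneRequest({ "x-crawl-request": "brombone" }));
```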

Render Complete Timeout

By default, BromBone waits for 2 seconds after the page is loaded before saving the HTML snapshots. This gives AJAX calls time to return and gives javascript time to manipulate the DOM.

For most pages 2 seconds is sufficient. However, this can be adjusted based on your site's needs.

Timeouts work great for most sites. However, a timeout can never guarantee that your site has finished rendering. For that guarantee, use a render complete class.

Render Complete Class

Instead of relying on a timeout, BromBone can wait for you to add the class "render-complete" to any element in the DOM. Once that class is added, the html snapshots will be created. If the "render-complete" class isn't added within 15 seconds of the page loading, no snapshot will be saved.
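One way to add the class at the right moment is to count the pieces of work the page is waiting on and add "render-complete" when the last one finishes. This is only a sketch of that idea; createRenderTracker is a hypothetical helper, and the element is passed in so the logic does not depend on a real browser.

```javascript
// Returns a done() function. Call it once per finished piece of work
// (AJAX call, template render, etc). When all are finished, the
// "render-complete" class is added to the given element.
function createRenderTracker(element, pendingCount) {
  let remaining = pendingCount;
  return function done() {
    remaining -= 1;
    if (remaining === 0) {
      element.classList.add("render-complete");
    }
  };
}

// In a browser, for a page that waits on two AJAX calls:
// const done = createRenderTracker(document.body, 2);
// ...call done() in each AJAX success handler.
```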

Javascript Function

After the timeout has passed or the render-complete class has been added (depending on your site settings), BromBone will call the Brombone.beforeCaching() function on your page if that function exists.

You can use this function to make additional DOM manipulations or run tests to verify that the page rendered correctly.

This is a good place to load any lazy load content such as content that loads on scroll. You can also hide system downtime notifications, "video did not load" error messages, and other similar items.

If Brombone.beforeCaching() returns true, then the snapshot will be saved. If Brombone.beforeCaching() returns false, then the snapshot will not be saved.
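A beforeCaching function along these lines would hide temporary notices and refuse the snapshot when the main content is missing. Treat it as a sketch: the selectors (.downtime-notice, .video-error, #main-content) are made up for this example and need to be replaced with ones from your own pages.

```javascript
// Called without arguments on your page, so `doc` defaults to the
// page's document. The parameter only exists to make testing easier.
function beforeCaching(doc = globalThis.document) {
  // Hide items that should not end up in the snapshot (example selectors).
  doc.querySelectorAll(".downtime-notice, .video-error").forEach((el) => {
    el.style.display = "none";
  });
  // Save the snapshot only if the main content actually rendered.
  return doc.querySelector("#main-content") !== null;
}

// Expose the function under the name described above.
globalThis.Brombone = globalThis.Brombone || {};
globalThis.Brombone.beforeCaching = beforeCaching;
```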

Error Callback

If there is an error creating one of your snapshots, BromBone will POST to a url of your choosing with a JSON body describing the error.

Proxy

When a bot visits your page, you need to serve the snapshot from BromBone instead of your normal content. How you do this will depend on your existing server architecture and your url structure.

Identifying Bots

HTML5 pushState

If you are using HTML5 urls, then you will identify requests coming from bots by the User-Agent header. We recommend doing a case-insensitive check against this list:

(google|yahoo|bing|baidu|jeeves|facebook|twitter|linkedin)

When using this method you need to make sure you are only proxying requests for pages, not for images, stylesheets, or other assets. Sites utilizing HTML5 urls already need to have logic to serve the same html page for all routes, but not to serve that html page when images, etc. are requested. Sometimes this is done by excluding filename patterns. Sometimes static assets are placed in a particular path. Whatever method you use, you should use the same filter when setting up the proxy.
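Put together, the bot check and the asset filter might look something like this. The bot pattern is the list above; the asset pattern is only an assumption for illustration (a filename extension filter), so substitute whatever filter your site already uses. shouldServeSnapshot is a hypothetical helper name.

```javascript
// Case-insensitive check against the recommended bot list.
const BOT_PATTERN = /(google|yahoo|bing|baidu|jeeves|facebook|twitter|linkedin)/i;

// Example asset filter by file extension. This list is an assumption.
const ASSET_PATTERN = /\.(js|css|png|jpe?g|gif|svg|ico|woff2?)$/i;

// Decide whether a request should be answered with a snapshot.
function shouldServeSnapshot(userAgent, path) {
  if (!userAgent || !BOT_PATTERN.test(userAgent)) return false; // not a bot
  const pathOnly = path.split("?")[0]; // ignore any query string
  return !ASSET_PATTERN.test(pathOnly); // pages only, never assets
}

console.log(
  shouldServeSnapshot(
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "/category/page1"
  )
);
```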

When a request is made to

http://www.yoursite.com/category/page1

You will proxy this url from BromBone

http://yoursite.brombonesnapshots.com/www.yoursite.com/category/page1
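In code, this mapping is a simple string join. The sketch below assumes your snapshot host is http://yoursite.brombonesnapshots.com/ as in the example; snapshotUrlFor is a hypothetical helper, and it does not apply any extra url encoding.

```javascript
// Snapshot host from the example above; yours will differ.
const SNAPSHOT_HOST = "http://yoursite.brombonesnapshots.com/";

// Map a requested host and path to the BromBone url to proxy from.
function snapshotUrlFor(host, path) {
  return SNAPSHOT_HOST + host + path;
}

console.log(snapshotUrlFor("www.yoursite.com", "/category/page1"));
```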

Hashbang

If you use the hashbang method, you will identify requests from bots because they will have the _escaped_fragment_ query parameter.

When a bot sees this url on your site

http://www.yoursite.com/#!category/page1

It will request

http://www.yoursite.com/?_escaped_fragment_=category/page1

You will proxy this page from BromBone

http://yoursite.brombonesnapshots.com/www.yoursite.com/%23!category/page1
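The hashbang mapping can be sketched the same way: read the _escaped_fragment_ parameter and put it back behind an encoded #! (%23!). This sketch reproduces only the example above. It drops any other query parameters and does not re-encode special characters in the fragment, so adjust it to the encoding rules for your site. The helper name is hypothetical.

```javascript
// Snapshot host from the example above; yours will differ.
const SNAPSHOT_HOST = "http://yoursite.brombonesnapshots.com/";

// Map a bot's _escaped_fragment_ request to the BromBone url to proxy from.
// Returns null when the request is not an _escaped_fragment_ request.
function snapshotUrlForEscapedFragment(host, requestTarget) {
  const url = new URL(requestTarget, "http://" + host);
  const fragment = url.searchParams.get("_escaped_fragment_");
  if (fragment === null) return null;
  // "%23" is the url encoded form of "#".
  return SNAPSHOT_HOST + host + url.pathname + "%23!" + fragment;
}

console.log(
  snapshotUrlForEscapedFragment(
    "www.yoursite.com",
    "/?_escaped_fragment_=category/page1"
  )
);
```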

Encoding

The part of the url after http://yoursite.brombonesnapshots.com/ should be url encoded just like the url in the sitemap with the following exceptions:

Proxying

When you sign up, we will personally walk you through setting up the proxy. There are three basic options.

Webserver

The proxy is most often set up in the webserver (Apache, nginx, IIS, etc) layer using rewrite rules with a proxy flag.

Application Layer

If setting up the proxy in the webserver layer isn't an option, then you can do it in the application (Rails, nodejs, Java, ASP.NET, etc) layer instead. This is typically done with a route filter or middleware.

CDN

If you use a CDN, it is often best to have the CDN determine which page should be shown. This will probably be done with a combination of Edge Side Includes and Multiple Origin Servers. The particulars will depend on your CDN.

Proxy, not a Redirect

The most important thing is to make sure you set up a proxy and not a redirect. Your server needs to download the page from BromBone and pass the file on to the bot. This way the bot will see the page coming from you and your domain, not from BromBone's domain. You don't want to set up a 301 or 302 redirect.
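In the application layer, the difference comes down to fetching the snapshot yourself and writing it into your own response, rather than sending the bot a Location header. The following is a rough sketch only: serveSnapshot is a hypothetical helper, res is assumed to behave like a nodejs http response (statusCode, setHeader, end), and fetch is assumed to be available globally. It leaves out error handling and caching.

```javascript
// Download the snapshot from BromBone and pass it on to the bot.
// The bot only ever talks to your domain.
async function serveSnapshot(snapshotUrl, res, fetchImpl = fetch) {
  const upstream = await fetchImpl(snapshotUrl); // server-to-server request
  const html = await upstream.text();

  res.statusCode = upstream.status; // pass the snapshot's status through
  res.setHeader("Content-Type", "text/html; charset=utf-8");
  res.end(html); // the body comes from BromBone, the response comes from you

  // What NOT to do: respond with a 301 or 302 and a Location header
  // pointing at the snapshot url.
}
```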

BromBone Works with any Setup

We can get the proxy working with almost any setup. If you have questions about how it will work with yours, please send us an email. We offer free consulting to help get your proxy configured.