It is becoming popular for developers to build static websites (no underlying back end) as single-page applications with JavaScript frameworks like AngularJS, EmberJS, or BackboneJS. An inherent problem with this approach is that crawlers (bots) can’t scrape the content because they run with JavaScript disabled. These crawlers include Google, Twitter, Facebook, Google Plus, LinkedIn, and even Slack and HipChat. Scrapers are getting better and better; however, based on what we’ve seen so far, using Prerender.io is by far the most effective solution to date. Additionally, it is the approach Google recommends when building single-page applications that render dynamic content and need that content to be publicly available.
In the lightning talk, I give a quick rundown of our challenge: we had tons of content that was valuable to our users (players, organizers, brands), such as tournament information, match data, team rosters, and tournament brackets, but our web application wasn’t built for “shareability” or “SEO-friendliness”. In other applications, you would typically have an underlying technology (NodeJS Express, Ruby on Rails, Django, Play Framework, Spring) serving up meta and Open Graph information so that any crawler with JavaScript disabled can easily access that content. Essentially, the information that matters most for making a particular view shareable on different platforms stays accessible even with JavaScript disabled.
After going over our challenges, I talked about how we came across a product called Prerender.io and described at a high level how the solution makes any single-page application with dynamic content play nicely with crawlers. On the positive side, getting the middleware implemented in production was a breeze, Prerender’s team was incredibly responsive and super helpful with technical support, and the analytics in Prerender’s dashboard were useful for finding out the scale of our platform’s shareable content.
One gotcha we ran into was in our nginx configuration when running inside a VPC. In the default nginx config, line 34 has the resolver directive, which points to Google’s DNS. There isn’t much documentation on adding multiple IP addresses, so if you are inside a VPC, add the IP address of your internal DNS before Google’s DNS. This ensures that nginx within your VPC can resolve and reach Prerender’s API when it detects a crawler’s user agent on an incoming request and forwards that request. This would look something like: resolver 1.1.1.1 8.8.8.8; (with 1.1.1.1 standing in for your internal DNS address).
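As a rough sketch of where that change sits, the fragment below shows the resolver line inside a named location like the one the Prerender nginx config uses. The address 10.0.0.2 is a hypothetical internal VPC DNS address and the location name is an assumption here; substitute your own values, and leave the rest of the Prerender rules as they are.

```nginx
# Sketch only: 10.0.0.2 is a placeholder for your VPC's internal DNS server.
location @prerender {
    # Internal DNS listed first, followed by Google's public DNS.
    resolver 10.0.0.2 8.8.8.8;

    # The remaining Prerender rules (user-agent checks, proxy_pass, and so on)
    # stay unchanged and are omitted from this sketch.
}
```

One caveat: the nginx documentation says that when several name servers are listed they are queried in a round-robin fashion, so the ordering is probably best treated as a convention rather than a strict priority.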
As a startup, you’re always looking for ways to distribute your product. One of the key things we decided to focus on was making the content on our platform, which includes useful information like Tournaments, Match Statistics, Player Profiles, Organizer Information, and Team Rosters, SEO-friendly and more easily accessible on platforms like Facebook, Google Plus, Twitter, and LinkedIn. Prior to our Prerender implementation, most of our Facebook posts didn’t have images and had very generic titles and descriptions. Fast forward to now: our shared links reach audiences with Game Cover Artwork and engaging calls to action.
Within three months (September 2014 to November 2014), we achieved the following results: