Split-testing a Ruby on Rails application

Nicholas Hibberd
Published in Palatinate Tech
12 min read · Aug 10, 2017


This post is an adaptation of a presentation given to the Ruby on Rails Oceania Sydney meetup on 13 June 2017.

At Spabreaks.com we have been split-testing our code regularly for a number of years, and we have learned a lot about our customers’ preferences along the way. Today I’m going to explain what split-testing is, talk about why we chose to write our own server-side split-testing utility rather than use one of the off-the-shelf client-side tools on the market, walk through the custom implementation that we have been using, and finally present a few examples of things we’ve split-tested and the surprising results we’ve had.

What is split-testing?

Split-testing, or A/B testing, is the practice of creating two or more versions of a webpage, or of some component on your website, serving one version to one set of customers and another version to a different set, recording who has seen which, and comparing which is the more successful.

It is an effective technique because you are able to compare different versions under identical conditions, whereas if you completely replace a page and try to compare results over different time periods then you may be introducing other variables that could have a bearing on the results.

It is important that visitors have no way of knowing they are part of an experiment and simply believe they are visiting your standard page. It is equally important that the test is conceived with a measurable goal in mind, and that the appropriate analytical tools are in place before the test starts so that you can measure the results.

The reporting platform that we use is Google Analytics and our implementation uses the Experiments API that they provide, but there is no reason that the techniques I’ll describe wouldn’t work with any other kind of reporting setup. The only requirement is that you have a way of associating sessions, as understood by the platform, with split-test variants, as understood by you and your web server.

Why did we write our own implementation?

There are a few split-testing tools on the market, and you can set up Experiments directly in the Google Analytics interface or via their Google Optimize product. In these cases the platform takes responsibility for deciding which version of your page to show each customer, for rendering the appropriate content, and for feeding that information back to the reporting engine. None of the techniques we looked at, however, satisfied all of the criteria we drew up.

We specifically wanted an implementation that:

Did not require a redirect between pages

One technique that Google Analytics recommends involves making different versions of your page available under different URLs; the platform then performs redirects from the original page to the variant page, but this does not provide an optimal user experience.

Was suitable for pages with dynamic URLs

The redirect solution is also only suitable for pages with fixed URLs, whereas we wanted to be able to run experiments on groups of pages such as all of our venue pages, or all of our region pages where the content is rendered dynamically based on a URL parameter.

Did not result in a visible flash when switching between page variants

Another technique recommended in the Analytics documentation involves providing all of your experiment variants as JavaScript functions that modify the page on the client side, but this can result in the user viewing the original page and then seeing it redrawn when the experiment JavaScript executes, which again is not ideal.

Was suitable for experiments where the content is modified on the server

The client-side solution also restricts the experiments you can run, whereas many useful things that we wanted to test required some kind of modification of the response on the server side, such as pulling alternative data out of the database.

Was simple to reconfigure for a new experiment

For a long time the perceived overhead of running experiments was the thing that prevented us from doing them at all, so we wanted something that was really simple to reconfigure whenever we planned to deploy a new experiment.

Ultimately we wanted to end up with something like the below, where we could tweak a few configuration variables, such as a path regex identifying the pages on which the experiment was to be run and the percentage of customers who would be exposed to it, and it would automatically generate global variables, accessible from anywhere in our controllers and views, that would allow us to render alternative content.
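Something along the lines of the following sketch (the key names and values here are illustrative, not our exact production configuration):

```ruby
# config/initializers/experiments.rb -- illustrative sketch only;
# key names and values are assumptions rather than the exact production config
EXPERIMENTS_CONFIG = {
  # Google Analytics experiment id used for reporting
  experiment_id: 'AbCdEfGhIjKlMnOpQrStUv',
  # Regex identifying the pages the experiment should run on
  path_regex:    %r{\A/venues/},
  # Percentage of visitors who will be shown the new variant
  percentage:    50
}.freeze
```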

So we basically worked backwards from that.

We made the config information accessible by setting the value to a constant called EXPERIMENTS_CONFIG, and we set about designing the utility that would handle our experiment logic.

The Experiments Utility

Now this utility had a number of responsibilities:

  • To respect our caching infrastructure
  • To be able to identify sessions that are part of the experiment
  • To allow us to specify an even or controlled split between page variants
  • To ensure that a customer sees the same page every time they visit
  • To provide us a way of serving different content depending on the variant that was chosen
  • To provide the data back to the reporting engine, in our case Google Analytics

Where should it live?

The first consideration for this new code is where we’re going to keep it, where it’s going to be called from and where it’s going to sit in the lifecycle of a request to our application.

You might think this would logically sit within the rails application and be called as an ActionController before filter for example, but what about requests that are served directly out of the cache and never hit the rails app?

This is absolutely critical to get right, because if your caching layer isn’t conscious of your experiment variants then you lose control over the proportion of customers that see each variant, and it becomes a lottery based on whichever variant was served when the cache was empty.

The Rack Middleware

Our cache layer is provided by a Rack middleware component, so it follows that our experiment code has to be executed earlier in the request lifecycle, so that our cache client has the ability to select a response based not just on the URL of the current request, but also on the variant of that page that has been identified for this customer.

Here is an example of how we would instruct Rack to insert the experiment code immediately before the cache layer.
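(A sketch; Rack::Cache below stands in for whichever caching middleware your stack actually uses.)

```ruby
# config/application.rb -- illustrative
config.middleware.insert_before Rack::Cache, Experiments
```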

And that in turn will load a simple rack app called Experiments.
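A minimal sketch of what that app might look like, following the description below (the experiment_info env key is our own naming convention, chosen for illustration):

```ruby
# lib/experiments.rb -- minimal illustrative sketch
class Experiments
  def initialize(app)
    @app = app
  end

  def call(env)
    request = Rack::Request.new(env)
    path    = request.path
    cookies = request.cookies

    manager = ExperimentManager.new(EXPERIMENTS_CONFIG, path, cookies)
    # Make the experiment data visible to the cache layer and to the rails app
    env['experiment_info'] = manager.experiment_info

    @app.call(env)
  end
end
```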

Rack provides an environment hash containing information about the request, which is made available to the rails app, and the thing we’re trying to do here is add our experiment information to that hash.

In the call method we first extract information from the existing environment hash: the path of the current request, and the cookies that may have been set in previous requests to our domain.

We then instantiate a new ExperimentManager, passing in the config and those two values, and we call a method on this class called experiment_info, modifying the environment hash with the return value.

Now we’ll look at the ExperimentManager in more detail.

The Experiment Manager

The ExperimentManager is initialised with three pieces of data: the config, the path of the current request, and the cookies.
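In sketch form (attribute names are our illustration):

```ruby
# Illustrative sketch of the manager class
class ExperimentManager
  def initialize(config, path, cookies)
    @config  = config
    @path    = path
    @cookies = cookies
  end
end
```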

The first thing we have to determine is whether the experiment should be run at all.

This code is called on every single request to the site, but not every page will be relevant to the experiment, and if you’re sending data to Analytics from sessions that don’t include visits to your experiment pages you’re just going to be adding noise to the results.
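A sketch of that check (the method name is ours):

```ruby
# Inside ExperimentManager (illustrative)
# Only run the experiment on pages matching the configured path regex
def run_experiment?
  @path =~ @config[:path_regex]
end
```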

And the thing we’re checking here is whether the path of the current request matches the path regex stored in the config file.

Now that we’ve established that the experiment is to be run, we can start building up the information hash that we will provide to the rails app. The first thing we can do is read the experiment_id from the config file, and after that we can provide the variant, which is derived by calling a method of the same name.
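Continuing the sketch (the hash key names are illustrative):

```ruby
# Inside ExperimentManager (illustrative)
def experiment_info
  return {} unless run_experiment?

  {
    'experiment_id' => @config[:experiment_id],
    'variant'       => variant
  }
end

def variant
  variant_from_cookie || assign_random_variant
end
```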

The thing we’re doing here is ensuring that a single customer sees the same variant on every visit, and we do that by setting a cookie. We check whether a cookie has already been set, and if so extract the variant id; if not, we assign a new random variant.

The variant_from_cookie method is pretty straightforward. We are setting a cookie called cxapi, so it just checks whether an existing cookie has been set with that name whose value matches the experiment_id set in our config file. If there is one then we extract the integer value that we know will appear at the end of the string.
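In sketch form, assuming the cookie value is the experiment id with the variant number appended:

```ruby
# Inside ExperimentManager (illustrative)
def variant_from_cookie
  value = @cookies['cxapi']
  return nil unless value && value.start_with?(@config[:experiment_id])

  # Extract the integer value that appears at the end of the string
  value[/\d+\z/]&.to_i
end
```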

But if that method returns nil then we will execute the second part of the expression, which is assign_random_variant.

All this method does is generate a number between 0 and 99 and compare it to the percentage specified in the config file, returning either a 1 or a 0 accordingly.
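In sketch form:

```ruby
# Inside ExperimentManager (illustrative)
def assign_random_variant
  # rand(100) gives a number between 0 and 99
  rand(100) < @config[:percentage] ? 1 : 0
end
```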

So at the end of all that our experiment_info method will return a value looking something like this:
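(The values here follow the illustrative config and sketches above.)

```ruby
{
  'experiment_id' => 'AbCdEfGhIjKlMnOpQrStUv',
  'variant'       => 1
}
```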

And just as a reminder, this is the value we set onto the environment hash that rack provides to the rails app, as in the middleware sketch above:
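```ruby
# From the middleware sketch above
env['experiment_info'] = manager.experiment_info
```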

Modifying experiment content inside Rails

Now that we have added the experiment info to the environment, we need to access it inside rails and make it available to our controllers and views.

We do this by setting a before_filter in our application controller that assigns the value to an instance variable called @experiment_info. We also set a variable called @variant1, a boolean telling us whether the customer is to see the new version of the page or the original version.
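A sketch of that filter, assuming the env key used in the middleware sketch above:

```ruby
# app/controllers/application_controller.rb -- illustrative sketch
class ApplicationController < ActionController::Base
  before_filter :set_experiment_info

  private

  def set_experiment_info
    @experiment_info = request.env['experiment_info'] || {}
    # true when this customer should see the new version of the page
    @variant1 = @experiment_info['variant'] == 1
  end
end
```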

And now that we have this, producing different versions of a webpage is as simple as checking for the variant number and displaying the corresponding content.
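For example, in a view (the partial names here are hypothetical):

```erb
<% if @variant1 %>
  <%= render 'venues/description_new' %>
<% else %>
  <%= render 'venues/description_original' %>
<% end %>
```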

The final thing we have to do is make sure that once we’ve displayed the right content, we tell Google Analytics which variant we used by loading the Content Experiments JavaScript API and calling the setChosenVariation function.

We also set a cookie whose value is a string made up of the experiment_id and the variant, and this will be read the next time the user visits another page on the site.
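A sketch of that reporting step in a layout or shared partial; the cx/api.js loader and cxApi.setChosenVariation are part of Google’s Content Experiments JavaScript API, while the surrounding markup follows the sketches above:

```erb
<% if @experiment_info.present? %>
  <script src="//www.google-analytics.com/cx/api.js?experiment=<%= @experiment_info['experiment_id'] %>"></script>
  <script>
    // Tell Google Analytics which variant this visitor was served
    cxApi.setChosenVariation(<%= @experiment_info['variant'] %>, '<%= @experiment_info['experiment_id'] %>');
  </script>
<% end %>
```

In this sketch the cxapi cookie itself (the experiment id with the variant appended) would be written with Rails’ standard cookies helper, for example in the same before_filter.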

And that is the entirety of our implementation. We’ve had it in place without modification for a couple of years now and we’ve run numerous experiments with it. I’m going to finish up by talking about some of the experiments that we’ve run and the results that we’ve had.

What experiments have we run?

Experiments for risky features

Sometimes we run a test because we’re a little nervous about a new feature and want to test the water on a small proportion of users before releasing it more widely.

We were looking to showcase venues on the homepage that were from the same geographical region as the customer, and we do this by deriving coordinates from the customer’s IP address and using them to fetch relevant venues from our database.

We knew that there was an element of risk attached to this, as geolocation via IP is not always accurate, particularly on mobile networks. In our research we realised that in some cases we were geolocating people from the far north of England to London, and if there’s one thing that’s guaranteed to piss off a northerner, it’s assuming that they want to go to London!

Luckily the split test narrowly found in favour of displaying the new content, so if there were any grumpy northerners they were in the minority!

Experiments for design changes

Sometimes we run a test because we want to test changes in layout and design. It’s very important to test these things because designers are usually very familiar with the site they are working on, and inclined to make assumptions about how users perceive the site that aren’t always substantiated.

Our design team had revamped our venue page, addressing a number of long-running concerns we had about the layout of the page. We had positive feedback from all the stakeholders in the business, and from pretty much everyone that we showed the page to, and if you ask me I still prefer that new page now.

We ran the split test on three occasions, each time adjusting the new page in response to findings from the previous test, but in the end we abandoned the new designs altogether because they were consistently outperformed by the existing page across a range of browsers and devices.

Had we gone with our intuition we would have ended up costing the company thousands of pounds in lost revenue.

Experiments for political reasons

Sometimes we run split tests to maintain cordial relations with other departments!

The marketing team at Spabreaks were keen to hit customers with a popup window a few seconds after they’d landed on the site to get them to sign up to our email newsletter. Our feeling was that this was more likely to just antagonise customers rather than persuade them to do business with us, but we could see the argument from the marketing team who were keen to keep in touch with less motivated visitors who we might be able to attract over the long term.

We were hoping that the test would support our objection to it, but it actually ended up alerting us to a hard-to-detect UX bug.

Click-throughs from the homepage were relatively unaffected, which surprised us, but we noticed that people who saw the popup were less likely to make a booking once they got to the venue page.

When we investigated we realised that there was an interaction issue between our booking calendar, which also displayed in a popup modal window, and the newsletter popup. If a customer already had the booking calendar open when the newsletter popup displayed, clicking to close the newsletter popup took the mouse focus away from the booking calendar, which meant that the customer could no longer scroll the page to reach the book button.

The compromise we reached was that we would continue to show the newsletter popup on the homepage, but we would remove it from all other pages.

We sadly weren’t able to prove that the popup was hindering the customer journey from the homepage, but it shows the incidental benefit of analysing features of your website in isolation.

Experiments for refactored code

And finally sometimes we run tests purely to verify code that we have rewritten without necessarily changing the customer facing behaviour.

All the major transactions on our site involve at least some interaction with a calendar. Our old implementation consisted of rails partials loaded by AJAX every time the customer clicked to switch months; it was clunky and slow to operate, and the codebase had become so complicated to maintain that we were reluctant to make changes to it. We rewrote all of our calendars in React, which was a significant investment in time and made the codebase a lot nicer, but for such a fundamental change we wanted to be sure that we hadn’t broken anything.

And a good job we did, because we had!

When we ran the split test the first time, the results were disappointingly in favour of the original implementation, which didn’t seem right, so we investigated and discovered that the conversion rate had dropped to 0% for certain Internet Explorer versions because of some incompatible JavaScript that we hadn’t provided a polyfill for.

When we fixed the issue and ran the split test for a second time, the results evened out, so we were happy to deploy our new changes.

We did so with a confidence that we wouldn’t have had if we had just gone ahead and deployed, and that is the most important thing to take home from this: by adopting a regular program of split-testing our code, we have gained a higher level of confidence in our own code and our decision making, and I’d encourage anyone to do the same.
