What the bleep is A/B testing?
Ever obsessed over whether your call-to-action buttons should be red or blue? Or whether your menu should have six navigational options or seven? It’s normal for those of us who care about our business to overthink these details. Obsessive-compulsive proclivities aside, there is inherent value in thinking through these minutiae and how they relate to the overall user experience. If we’re running any sort of e-commerce site, the good news is that the end goal of such experimentation is abundantly clear: we want to see an increase in sales.
Standardizing A/B testing helps us overcome personal biases and illuminates aspects of user behavior we wouldn’t otherwise have noticed. In other words, when we approach our website designs methodically, without the sometimes distracting cloud of our own aesthetic preferences, we’re more likely to arrive at the design best optimized for conversions.
Okay…let’s try to define A/B Testing
The simplest way to explain A/B testing is to call it a this-or-that method of ferreting out which version of something performs best for your desired outcome. In web design, that desired outcome is typically defined as a conversion of some kind (conversions are not always sales; they can be anything from leads to a certain number of page views to a specific level of interaction).
The important thing here is that we begin by defining what our conversion is, or what “success” looks like. Only with a specific goal in mind can we begin to search for a “better” version of our website or marketing campaign.
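To make that concrete, here’s a minimal sketch (in Python, with made-up variant names and a generic “conversion” event, not any particular testing tool) of what the mechanics of a simple A/B test look like: split visitors randomly between two versions, count conversions against whichever version each visitor saw, and compare the rates.

```python
import random
from collections import defaultdict

# Hypothetical two-variant test: each visitor is randomly assigned one
# version of the page, and we count whatever "conversion" we defined
# (a sale, a lead, a sign-up) against that version.
VARIANTS = ["A", "B"]

assignments = {}  # visitor_id -> variant
results = defaultdict(lambda: {"visitors": 0, "conversions": 0})

def assign_variant(visitor_id):
    """Give each visitor a sticky, random 50/50 variant."""
    if visitor_id not in assignments:
        variant = random.choice(VARIANTS)
        assignments[visitor_id] = variant
        results[variant]["visitors"] += 1
    return assignments[visitor_id]

def record_conversion(visitor_id):
    """Credit the conversion to whichever variant the visitor saw."""
    results[assignments[visitor_id]]["conversions"] += 1

def conversion_rates():
    """Conversion rate per variant so far."""
    return {v: r["conversions"] / r["visitors"]
            for v, r in results.items() if r["visitors"]}
```

Real testing tools handle the assignment and tracking for you; the point is simply that “success” has to be a countable event before any comparison means anything.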
When should you use A/B Testing?
Whenever you’re trying to make something better and have control over a number of variables related to its design or structure, you’ve got a great candidate for A/B testing on your hands. If you’re a digital marketer, you may not have realized it, but you’ve likely already conducted a variety of A/B tests simply by pushing ads on platforms like Google and Facebook.
Both Google and Facebook use algorithms that will, at one time or another, favor one ad over another based solely on performance. This is a form of rapid micro A/B testing that people run every day, often without realizing it. Beyond these algorithms, you have ample opportunity to manually construct similar tests around controlled variables such as images, text, and landing pages. On Facebook you can run full-on A/B testing campaigns, but you can achieve the same result by simply duplicating a campaign and changing one thing in each copy to see which performs best. In Google Ads you can replicate this exact process by creating duplicate ad groups with slight variations.
The theory of relativity is in full effect…
Although Einstein didn’t have marketing on the brain when he came up with his general and special theories of relativity, the underlying idea that what you observe depends on your frame of reference applies to the digital industry in many ways. Understanding the relative nature of data exposes a lot of common pitfalls in how data gets gathered and interpreted.
Those of us who consider ourselves savvy, data-driven marketers are often confronted with the confounding limitations of interpreting a small amount of data. The truth is that the amount of data at your disposal directly determines how much you can safely conclude from it. If you’re looking at a week of data, you may draw drastically different conclusions than you would from a month of data. And you’re almost certain to get a different picture from a year’s worth of data.
While there is definitely something to be said for acting in the moment to create the best possible ROI, it is this longer-term data that allows us to make long-term plans and scale our businesses year over year. Without it we’re constantly chasing short-term successes that may dry up in the face of larger trends.
For example, if one were to take the sales data solely from Black Friday and Cyber Monday, they might conclude that business is booming and that very little work needs to be done to attract new customers. That interpretation will change if the data set is expanded to cover the months of November and December, and it will change even more once it spans an entire year of user behavior.
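One way to see why the window matters is to put a rough confidence interval around a conversion rate. The numbers below are invented, and the normal-approximation interval is only a back-of-the-envelope device, but notice how the plausible range tightens as the window (and therefore the sample) grows:

```python
import math

def conversion_interval(conversions, visitors, z=1.96):
    """95% normal-approximation confidence interval for a conversion rate."""
    rate = conversions / visitors
    margin = z * math.sqrt(rate * (1 - rate) / visitors)
    return rate, max(0.0, rate - margin), min(1.0, rate + margin)

# Invented numbers with the same underlying 4% rate at three window sizes:
for label, conversions, visitors in [("one week", 20, 500),
                                     ("one month", 80, 2000),
                                     ("one year", 960, 24000)]:
    rate, lo, hi = conversion_interval(conversions, visitors)
    print(f"{label:>9}: {rate:.1%} (plausible range {lo:.1%} to {hi:.1%})")
```

A week of data leaves a lot of room for the “true” rate to sit far from what you measured; a year of data pins it down.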
This same phenomenon is directly applicable to the number of variables being tested within an A/B framework.
If I wanted to see whether the call-to-action text “Buy Tickets” or “Register Online” got more people to sign up for an event, and the latter worked best, I might infer that the word “Buy” was off-putting to that event’s intended audience. But that inference would be a shot in the dark, because two words changed at once. For all we know, an untested “Buy Online” might have earned an even better return, simply because people respond more to the word “Online” than to the word “Tickets”, while still preferring “Buy” to “Register”.
And of course, the verbiage may make no statistically significant difference at all, so long as the images for the event connect with the audience. If we’re not careful to control every variable presented to the user, our interpretation of that user’s behavior may be wildly skewed by our own myopic focus.
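If you want to check whether a gap like that is real or just noise, one standard back-of-the-envelope tool is a two-proportion z-test. The sketch below uses invented numbers for the two call-to-action variants; with a p-value above 0.05 we would have no business declaring a winner:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: is the difference in conversion rates
    larger than random noise would easily explain?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p_value

# Invented numbers: "Buy Tickets" 48/1000 vs "Register Online" 63/1000
z, p = two_proportion_z(48, 1000, 63, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # here p is roughly 0.14, so the gap may well be noise
```

A difference that looks decisive in a small sample can evaporate once you ask how likely it was to appear by chance.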
Every case is different, but over time certain trends will become apparent.
For example, we usually find that content trumps aesthetics. If you have a robust product worth buying, the ability to convey the purpose of that product is far more important than the wrapping paper it arrives in. As long as the text and images effectively convey the product’s message, that product can succeed in a dynamic online marketplace. Along these lines, just because changing the call-to-action buttons from blue to yellow produces higher purchase rates, it doesn’t necessarily follow that people prefer yellow to blue. It could simply be that yellow contrasts better with the background color of the product page, making the call-to-action text much more noticeable.
In that case, the user’s ability to easily identify the call-to-action button matters far more than the color of the button itself. Oftentimes the color schemes we find less polished will get more clicks, and if we’re conducting responsible A/B tests, we have to follow the data wherever it leads.