The complex nature of running multivariate tests

How multivariate tests yield impractical insights and what other solutions you can run for better results.

VP of Global Marketing, Dynamic Yield

Multivariate testing is often believed to be the be-all-end-all for optimizing numerous combinations of website elements, but there’s more to it than meets the eye. In this post, we’ll address how the complex nature of running multivariate tests renders them impractical and why smarter A/B testing is, in fact, the best possible solution for results.

If you want to build a truly exceptional website — one that both catches the attention of casual visitors and quickly turns their curiosity into CTA clicks and conversions — you’re going to need more than just an eye-catching design. You need to test everything, from the background colors to the placement of your forms, to see which elements deliver the best results.

You might even be tempted to test all of these elements at once. Why bother A/B testing each individual element when you can just run one complex, be-all-end-all multivariate test for everything, right?

Not so fast.

I’ve done a lot of multivariate testing over the years, and the results have never — ever — been worth the effort.

What exactly is multivariate testing, and what’s so bad about it? To answer those questions, we need to start with a brief recap of A/B testing.

A/B testing is exactly what it sounds like. There’s an “A” version of a page, which has one design or content element, and a “B” version that has a variation on that same element. By testing these pages against each other, you can easily see which one performs best. A/B testing allows you to quickly and easily test these variations. Does the site generate more conversions with a light blue background, or with a medium gray one? The more individual elements you test — button color, headline placement, images — the more information you have to create the best-performing site.

Multivariate testing, on the other hand, works by testing the performance of every possible combination of the elements you want to test, all at the same time. After running a multivariate test, you have a ton of data about which combinations perform the best. It seems like this would be a great shortcut, but in practice, it’s usually a waste of time.

One of the very first experiments I ran involved a deep dive into multivariate testing and the problems were obvious from the start. I was using Google’s Website Optimizer — an early version of their Experiments tool — and it was more than powerful enough to handle any test I could dream up. Why not start with a multivariate one?

I had three different page layouts, with three different sets of colors, and three unique headlines. As multivariate tests go, this was simple stuff. All I had to do was launch the test, and let Website Optimizer examine the site’s traffic to give me an estimate of how long the test would take to complete.

Very quickly, I had my answer: at current traffic levels, it would take a little over 53 years to complete.

At first, I thought I’d made a mistake, but I hadn’t. The site I was working with just didn’t have enough traffic. If I wanted to see meaningful results, I’d either need a few million more website visits, or I’d need a less complex test.
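
To see why a seemingly simple test balloons, it helps to count the combinations. The sketch below is illustrative only: the element names are placeholders, and the per-arm visitor target and daily traffic figures are hypothetical numbers chosen to show the scaling, not the figures from the Website Optimizer test described above.

```python
from itertools import product

# Placeholder names for the three layouts, color sets, and headlines.
layouts = ["layout_1", "layout_2", "layout_3"]
color_sets = ["colors_1", "colors_2", "colors_3"]
headlines = ["headline_1", "headline_2", "headline_3"]

# A full multivariate test needs one arm per combination of elements.
combinations = list(product(layouts, color_sets, headlines))
print(len(combinations))  # 27


def days_to_finish(arms: int, visitors_per_arm: int, daily_visitors: int) -> float:
    """Days until every arm has seen visitors_per_arm visitors, with traffic split evenly."""
    return arms * visitors_per_arm / daily_visitors


# Hypothetical inputs (assumptions, not measured values).
VISITORS_PER_ARM = 10_000
DAILY_VISITORS = 500

ab_days = days_to_finish(2, VISITORS_PER_ARM, DAILY_VISITORS)
mvt_days = days_to_finish(len(combinations), VISITORS_PER_ARM, DAILY_VISITORS)
print(ab_days, mvt_days)  # 40.0 540.0
```

With the same assumed traffic, the 27-arm test takes 13.5 times as long as a two-arm A/B test, simply because each combination receives a much smaller slice of the visitors.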

Since that humbling first experience, I’ve done a lot of work with multivariate testing. I’ve used it to test highly complex sites with incredible traffic numbers, and on smaller projects where multivariate tests took a painfully long time to complete. With the exception of a few rare situations, multivariate testing just isn’t worth it.

In fact, I know of five big reasons you should avoid multivariate testing whenever possible.

Reasons to avoid multivariate tests

It’s true that multivariate testing is a highly sophisticated tool. A single multivariate test can answer dozens of questions, all at once, while an A/B test can only answer one question at a time. But just because a test is complex doesn’t mean that it’s better, or that the data generated is more useful.

Multivariate tests have five huge problems in practical use. Let’s take a look at them.

1. They require tons of traffic

Multivariate testing requires a massive amount of input data, which means that it only works well when the site already has a ton of traffic. How much traffic? It depends on the test, but in my experience, it’s always an absurdly high number of visits. The more complex the test, the more visitors — and time — it requires to generate results.
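
A rough way to put a number on this is the standard normal-approximation sample size formula for comparing two conversion rates. The sketch below assumes a two-sided test at 5% significance and 80% power. The function names, the 3% baseline conversion rate, and the 10% relative lift are all assumptions for illustration, and the calculation ignores corrections for comparing many arms at once, which would push the totals even higher.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_arm(baseline: float, lift: float,
                     alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate visitors needed per arm to detect a relative lift in conversion rate."""
    p1 = baseline
    p2 = baseline * (1 + lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def total_visitors(arms: int, baseline: float, lift: float) -> int:
    """Total visitors when every arm must reach the per-arm target."""
    return arms * visitors_per_arm(baseline, lift)


# Hypothetical scenario: 3% baseline conversion rate, 10% relative lift.
print("per arm:", visitors_per_arm(0.03, 0.10))
print("A/B test (2 arms):", total_visitors(2, 0.03, 0.10))
print("3 x 3 x 3 multivariate test (27 arms):", total_visitors(27, 0.03, 0.10))
```

Under these assumptions each arm needs a little over 50,000 visitors, so a 27-arm test needs well over a million in total. Halving the lift you want to detect roughly quadruples the requirement.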

2. They’re tricky to set up

One of the things that makes A/B testing so appealing is that the tests themselves are simple to set up. After all, you’re only changing one element and keeping the rest of the design the same. Anyone with a working knowledge of web design can set up simple A/B tests, and even complex tests rarely require more than a few moments of a developer or web designer’s time.

By contrast, multivariate tests often require somewhat complex coding to work. Even creating a basic multivariate test is a lot of work, and it’s all too easy for one to go off the rails. A tiny mistake in the test design might not become obvious until the test has been running for weeks, or even months. If you don’t have a ton of experience running a variety of different kinds of tests on many different websites, then you shouldn’t even consider a multivariate test.

3. Hidden opportunity costs

When you’re testing a website, time becomes one of your most valuable commodities. Multivariate tests are slow to set up and slower to run. All that lost time adds up, creating serious opportunity costs.

During the time that it takes you to see meaningful results from a single multivariate test, you could run dozens of A/B tests. Each of those A/B tests will quickly provide you with a definitive answer to a specific question, giving you information you can act on immediately.

4. They don’t allow you to move quickly and fail fast

If there’s any time to “move fast and break things” as you optimize your website, it’s during the testing phase. This approach allows you to quickly learn what works and what doesn’t, try crazy ideas, and even fail spectacularly without any real risk. It can be extremely effective, but it’s completely incompatible with multivariate testing.

The iterative nature of A/B tests is a serious advantage. Each A/B test gives you a result, even if that result is inconclusive, creating a series of waypoints you can refer back to as you hone your site’s design. With a multivariate test, you don’t have that trail of breadcrumbs to follow. Worse, multivariate tests are too slow and tedious to really justify taking those risks in the first place.

5. Multivariate tests are biased towards design

Another important thing to remember is that multivariate testing often provides answers to questions that actually aren’t all that important. Some of the strongest proponents of multivariate testing that I’ve met are UX and UI designers. These are people who have read about the successes that big players like Google and Amazon have had from multivariate testing of tiny, tiny details, like which shade of blue generates the best-performing page links. For a company operating on Google’s scale, a 0.01% boost to clicked links from a slightly darker blue is significant, but for most websites, that’s not very helpful information.

Design is important, but it’s not everything. UI and UX elements represent a fraction of the variables you can use to test the performance of a website. You can just as easily test variations in the website copy, promotional offers, and even site functionality. These are elements that often go overlooked in multivariate testing, even though they can have tremendous impacts on conversion rates and other essential metrics.

A/B testing’s limitations and possible solutions that don’t require multivariate tests

One of the major selling points of multivariate testing is that it can solve problems that typical A/B testing can’t. In reality, however, most of these claims are either greatly exaggerated or flat-out wrong. With very few exceptions, a well-planned series of A/B tests is just as powerful as any multivariate test, not to mention faster.

Let’s tackle these claims one by one.

1. A/B tests can’t handle multiple variants

One of the biggest myths about multivariate testing is that it’s the only practical way to test multiple elements at once. That’s simply not true. It’s entirely possible to create a more complex A/B-type test, such as an A/B/C or A/B/C/D test, to compare multiple variants. These are much slower to run than typical A/B tests, but they’re considerably faster than even a simple multivariate test.

In reality, overlapping A/B tests are just as effective. Something that has always surprised me is the idea that running overlapping A/B tests on the same page will somehow generate invalid results. Statistically speaking, that’s not really the case. If done correctly, concurrent A/B tests are just as valid as any other A/B test.

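Consider a high-traffic website that wants to test two different offers. One offer is for a free demo of their product, while the other is for a free 30-day trial. At the same time, the company also wants to test an orange CTA button against a red one. From a statistical point of view, these are easy things to adjust for. As long as each visitor is assigned to an offer and to a button color independently and at random, every offer group sees roughly the same mix of button colors, so barring a strong interaction between the two elements, each result can be read on its own.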
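
One common way to run two tests on the same page is to assign each visitor to each test independently. The sketch below is a minimal illustration, not a description of any particular testing tool: the `assign` helper, the experiment names, and the synthetic user IDs are all hypothetical. It hashes the user ID together with the experiment name, so the offer a visitor sees tells you nothing about which button color they see.

```python
import hashlib
from typing import Sequence


def assign(user_id: str, experiment: str, variants: Sequence[str]) -> str:
    """Deterministically map a user to a variant, independently for each experiment name."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Simulate 10,000 synthetic visitors taking part in both tests at once.
counts = {}
for i in range(10_000):
    user = f"user-{i}"
    offer = assign(user, "offer-test", ["free_demo", "free_trial"])
    button = assign(user, "cta-color-test", ["orange", "red"])
    counts[(offer, button)] = counts.get((offer, button), 0) + 1

# Each of the four offer/button cells should hold roughly a quarter of the visitors.
for cell, n in sorted(counts.items()):
    print(cell, n)
```

When assignment is independent like this, the four offer and button cells come out roughly equal in size, which is what allows the offer test and the button test to be analyzed separately.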

2. Complex A/B tests are too complex to be practical

Another claim I’ve run into more than once is the idea that complex A/B tests are somehow more difficult to create, run, and analyze than multivariate tests. True, complex A/B tests can require a little more in the way of planning, as they often build on the previous results, but that doesn’t mean that any individual test is mind-bogglingly complicated. That’s not something I can say for even a typical multivariate test.

Unfortunately, most of the people tasked with creating and running these tests are marketers, and most marketers don’t have a background in statistical analysis. It’s difficult to design and interpret these sophisticated tests if you don’t understand the methodology. This can make multivariate testing seem like the easier choice, even though it’s often the wrong tool for the job.

3. The perfect test fallacy

This brings me to my biggest problem with multivariate testing: It’s an inherently lazy solution. There’s this idea that all you need to do is set up the perfect test, and then just forget about it until the results come back with the perfect answer. I had that same idea when I ran my first Google Website Optimizer test. I quickly learned that the real world is far too messy a place for this approach to actually work.

Only after investing significant effort in creating these multivariate tests, and considerable time running them, will you even know if you’re asking the right questions. Make a mistake or leave out a variable, and you’re back to square one. That’s a lot of work to go through to find the right font size for a header.

By comparison, A/B testing is easy and fast. It requires a disciplined and active approach, however, creating clearly defined test conditions to answer specific questions. Each A/B test delivers an answer, even if that answer is inconclusive, adding to a growing pool of data. This makes refining future tests that much easier, focusing on specific segments in a way that would be completely impractical to attempt with multivariate testing.

I’m not claiming multivariate website testing is a fundamentally bad idea. There are situations — particularly for big companies like Google and Amazon — where it might even be the smartest and most cost-effective solution. But for everyone else, it’s often the worst possible way to optimize the results for a website.