by Jennifer Lonoff Schiff

11 A/B testing mistakes and how to avoid making them

How-To
Apr 06, 2015 | 7 mins
CRM Systems | E-commerce Software | Internet

Poorly done A/B testing will not only yield inaccurate data on what your customers prefer, but it may actually cause you to lose customers. Here are 11 classic mistakes and ways you can avoid them.


Most companies (if not all) have at some point conducted an A/B test, testing two versions of something related to content, design or price to help them improve their landing pages and conversion rates. However, not all A/B tests are created, or conducted, equally.

Indeed, design your A/B test poorly and not only will you not get an accurate assessment of what your customers prefer, but you may actually wind up losing customers.

So what can companies do to ensure their A/B tests are well-designed and yield positive results? Following is a list of the 11 most common (and serious) A/B testing mistakes and what your organization can do to avoid them.

Mistake No. 1: Testing too many elements, or variables, at a time.

“One of the top A/B testing mistakes is having too many test variables,” says Corinne Sklar, CMO, Bluewolf, a global business consulting firm. “For example, having split subject lines and different calls to action in the email body makes it impossible to determine which of the two was the success factor in driving leads.” To avoid this problem, “keep your testing to one variable at a time.” That way, she explains, you will “gain a better understanding of what content strategy is working most with your intended audience.”

Mistake No. 2: Testing something that’s obvious or has already been proven to be more effective.

“There’s no reason to test something as simple as ‘Dear Customer’ vs. ‘Dear First Name,’” says Joshua Jones, managing partner, StrategyWise, a global provider of data analytics and business intelligence solutions. “This has already been tested time and again and the results are clear: customization [or personalization] is better. No reason to waste time and resources on the basics when you can test more critical elements, such as price elasticity or feature preferences.”

Mistake No. 3: Testing something insignificant that’s hard to quantify.

“Test apples against oranges first,” says Justin Talerico, founder & CEO, ion interactive, an interactive content software and services provider. “Find the big winner. Then iterate with smaller variations — but never so small that the lift isn’t worth the effort. Testing is justified with big wins.”

Mistake No. 4: Testing something you can’t actually deliver.

“A/B tests can give you eye opening results with lots of potential,” says Jess Jenkins, digital analyst, LYONSCG, an ecommerce digital agency. “But those insights are useless if you can’t act on what you’ve learned,” she points out. So “make sure you are testing items that are actionable. For example, video content might be your ticket to better engagement but do you have the resources and plan to [do this]?” Before you test something, “ensure that you can deliver what you learn from your tests.”

Mistake No. 5: Testing the wrong thing, or making (false) assumptions.

Sometimes, companies assume something is a problem when it isn’t – or test the wrong thing. “For example, an ecommerce manager may test a $50 vs. a $100 threshold for free shipping, but [he] may never test whether or not the current customers even value free shipping at all,” says Dan Hutmacher, senior digital consultant, LYONSCG. Or they might “test the color of a button without considering its size, shape or location.”

So before conducting an A/B test, think about what it is you really want to test.

Mistake No. 6: Running an A/B test at different times.

“If both your A and B variant are not tested at the same time, you’ve invalidated your test,” says Troy O’Bryan, CEO, Response Capture, a demand generation agency. “The same audience at a different time may have a different opinion. Segment your audience into two groups and run the A/B variant simultaneously.”
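One common way to keep both variants running over the same period is deterministic bucketing: each visitor is hashed into group A or B the moment they arrive, so the two variants always see the same days, campaigns and seasonality. The sketch below is illustrative only; the experiment name and user IDs are hypothetical, not from the article.

```python
# A minimal sketch of simultaneous A/B assignment via hash-based bucketing.
# The experiment name and user IDs are illustrative placeholders.
import hashlib

def assign_variant(user_id: str, experiment: str = "homepage-test") -> str:
    """Deterministically bucket a user into 'A' or 'B' (roughly 50/50)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

# Every visitor gets a variant as they show up, so A and B are exposed
# to the same audience mix at the same time.
for uid in ["user-101", "user-102", "user-103"]:
    print(uid, assign_variant(uid))
```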

Mistake No. 7: Your landing pages are flawed (and invalidate the test).

“The biggest mistake that we see in A vs. B tests is that there will not be proper consistency between the [landing page] copy and [the] landing page design to provide a true apple vs. apple test,” says Jason Parks, owner, The Media Captain, a digital marketing company.

“If you want to test advertising copy, the landing pages need to be identical [in design] to determine which advertising copy performs better based on conversions,” he says. “If you want to test landing page design, [then] the ad copy has to be identical. If [the pages and tests are] not set up properly, your results will be skewed.”

“For example, when A/B testing our homepage at Qeryz, our goal was to increase freemium signups,” says Sean Si, the founder & CEO of SEO Hacker and Qeryz, a survey tool for websites. “The problem is, we weren’t able to fully qualify the signups because the original version of the homepage didn’t have a signup form in it while the variation version did. The goal count was set for people who went to our signup page and then entered onboarding,” he says.

“This was a major mistake because people who were seeing the variation version did not need to go to our signup page because there was a signup form right there in the homepage! Because of the mistake in goal setting, the data in that A/B test was severely skewed,” he says. “We had to redo it and it took another two weeks for data gathering, which sucked.”

Mistake No. 8: Using too small a sample size.

“The single most important factor that determines the power of a given test is the sample size,” says Jonas Dahl, business analyst for Adobe Target, a personalization solution. “To avoid underpowering your test, consider [that] a typical standard for a well-powered test includes a confidence level of 95 percent and a statistical power of 80 percent.”

To calculate sample size, Dahl suggests using a sample size calculator. “The calculator helps ensure the test has enough statistical power (ability to detect lift). It can also prescribe the right sample size,” he says. “If a sample size is too small, then you may not be able to detect a substantial lift. The great thing about the calculator is that it can decide the minimum sample size.” 
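For readers who prefer to run the numbers themselves, here is a hedged sketch of the calculation behind such a calculator: a two-proportion test at 95 percent confidence (alpha = 0.05) and 80 percent power. The baseline conversion rate and minimum lift below are assumed example values, not figures from the article.

```python
# Sketch of a sample-size calculation for a two-proportion A/B test.
# Baseline rate and minimum detectable lift are assumed example values.
from math import ceil

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline_rate = 0.05          # current conversion rate (assumed)
minimum_lift = 0.20           # smallest relative lift worth detecting (assumed)
variant_rate = baseline_rate * (1 + minimum_lift)

# Cohen's h effect size between the two conversion rates.
effect_size = abs(proportion_effectsize(baseline_rate, variant_rate))

# Visitors needed in each variant for alpha = 0.05 and power = 0.80.
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.80,
    ratio=1.0,
    alternative="two-sided",
)
print(f"Visitors needed per variant: {ceil(n_per_variant)}")
```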

Mistake No. 9: Not using segmentation.

“By looking at the results for all users, you’re probably missing some more nuanced behavior changes happening for a narrower segment (e.g., first-time users vs. returning users),” says Dexter Zhuang, product manager, Growth Marketing, CreativeLive, which provides free online classes. To avoid this problem, “make a clear and very specific outline of your experiment’s goals and be sure you’re not experimenting with the wrong cohorts within your A/B test,” he says. One way to “segment your audience [is] by how many times they have completed X action before (purchase, enroll, watched live, or whatever other behavioral metrics you track).”
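As a rough illustration of that kind of segmented reporting, the sketch below breaks test results down by a behavioral cohort before comparing variants. The DataFrame columns and the first-time/returning split are hypothetical examples, not data from the article.

```python
# Minimal sketch of segmented A/B reporting with pandas.
# Column names and values are hypothetical examples.
import pandas as pd

results = pd.DataFrame({
    "variant":         ["A", "B", "A", "B", "A", "B"],
    "converted":       [0,   1,   1,   1,   0,   0],
    "prior_purchases": [0,   0,   3,   5,   0,   1],
})

# Label each user as first-time or returning, then compare conversion
# rates per variant within each segment instead of only in aggregate.
results["segment"] = results["prior_purchases"].apply(
    lambda n: "first-time" if n == 0 else "returning"
)
print(results.groupby(["segment", "variant"])["converted"].mean())
```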

Mistake No. 10: Ending tests prematurely.

“It is tempting to stop a test if one of the offers performs much better or worse than the others in the first few days of the test,” says Dahl. “However, as the test collects more data points, the conversion rates converge toward their true long-term values and the positive or negative lift can easily change,” he says. “The best way to avoid these issues is to determine an adequate number of visitors before running the test. Then let the test run until this number of visitors has been exposed to the offers.”
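In practice, that advice amounts to a simple guard: fix the required number of visitors before launch (for example, from a power calculation like the one above) and only evaluate once every variant has reached it. The counts in this sketch are illustrative.

```python
# Sketch of a "don't stop early" guard. All numbers are illustrative.
REQUIRED_PER_VARIANT = 12_000                # decided before the test started

visitors_exposed = {"A": 8_450, "B": 8_512}  # visitors exposed so far

if all(n >= REQUIRED_PER_VARIANT for n in visitors_exposed.values()):
    print("Sample size reached; analyze and report the results.")
else:
    print("Keep the test running; early lift figures can still flip.")
```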

Mistake No. 11: Reporting results before the test is complete.

“Early results are almost always misleading by wide margins one way or the other,” says Tom Kuhr, chief product officer at AgentAce.com, a real estate startup that connects homebuyers and sellers to realtors. So “it’s best to keep your mouth shut until the test is complete. Otherwise you might be eating crow.”