News Release

Market researchers and online advertisers, are A-B tests leading you astray? A new study says they could be.

News from the Journal of Marketing

Peer-Reviewed Publication

American Marketing Association

Researchers from Southern Methodist University and the University of Michigan published a new Journal of Marketing study that examines platforms’ A-B testing of online ads and uncovers significant limitations that can lead to misleading conclusions about ad performance.

The study, forthcoming in the Journal of Marketing, is titled “Where A-B Testing Goes Wrong: How Divergent Delivery Affects What Online Experiments Cannot (and Can) Tell You About How Customers Respond to Advertising” and is authored by Michael Braun and Eric M. Schwartz.

Consider a landscaping company whose designs focus on native plants and water conservation. The company creates two advertisements: one focused on sustainability (ad A) and another on aesthetics (ad B). As platforms personalize the ads that different users receive, ads A and B will be delivered to groups with diverging mixes of users. Users interested in outdoor activities may see the sustainability ad, whereas users interested in home decor may see the aesthetics ad. Targeting ads to specific consumers is a major part of the value that platforms offer to advertisers because it aims to place the “right” ads in front of the “right” users.

In this new study, researchers Braun and Schwartz find that online A-B testing in digital advertising may not be delivering the reliable insights marketers expect. Their research uncovers significant limitations in the experimentation tools provided by online advertising platforms, potentially creating misleading conclusions about ad performance.

The Issue with “Divergent Delivery”

The study highlights a phenomenon called “divergent delivery,” in which the algorithms used by online advertising platforms like Meta and Google target different types of users with different ad content. The problem arises when the algorithm delivers each ad to a distinct mix of users during an A-B test: an experiment designed to compare the effectiveness of the two ads. Braun explains, “The winning ad may have performed better simply because the algorithm showed it to users who were more prone to respond to the ad than the users who saw the other ad. The same ad could appear to perform better or worse depending on the mix of users who see it rather than on the creative content of the ad itself.”
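The confound described above can be illustrated with a small simulation. This is a hedged sketch, not the authors’ model: the segment names, response rates, and 80/20 routing split are invented for illustration. Both ads are equally effective for every user; only the platform’s routing differs, yet a naive A-B comparison declares a “winner.”

```python
import random

random.seed(7)

# Two user segments with different baseline response rates.
# Crucially, both ads are equally effective for any given user:
# the only thing that differs is which users each ad is shown to.
RESPONSE_RATE = {"outdoor_fan": 0.08, "decor_fan": 0.02}

def platform_delivery(segment):
    """Stand-in for opaque targeting: route outdoor fans mostly to ad A."""
    if segment == "outdoor_fan":
        return "A" if random.random() < 0.8 else "B"
    return "B" if random.random() < 0.8 else "A"

results = {"A": [0, 0], "B": [0, 0]}  # ad -> [impressions, conversions]
for _ in range(100_000):
    segment = random.choice(["outdoor_fan", "decor_fan"])
    ad = platform_delivery(segment)
    results[ad][0] += 1
    # Response depends only on the user segment, not on the ad content.
    if random.random() < RESPONSE_RATE[segment]:
        results[ad][1] += 1

for ad, (n, conv) in results.items():
    print(f"Ad {ad}: {conv / n:.3%} conversion rate")
```

Because ad A is routed disproportionately to the responsive segment, it reports roughly twice the conversion rate of ad B even though the creatives are interchangeable, which is exactly the misleading comparison the study warns about.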

For an advertiser, especially one with a large audience to choose from and a limited budget, targeting provides plenty of value, so large platforms like Google and Meta use algorithms that allocate ads to specific users. On these platforms, advertisers bid for the right to show ads to users in an audience. However, the winner of an auction for the right to place an ad on a particular user’s screen is determined not by the monetary value of the bids alone, but also by the ad content and user-ad relevance. The precise inputs and methods that determine the relevance of ads to users, how relevance influences auction results, and, thus, which users are targeted with each ad are proprietary to each platform and are not observable to advertisers. Exactly how the algorithms determine relevance for different types of users is not known, and the process may not even be enumerable or reproducible by the platforms themselves.
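A toy auction makes this concrete. The scoring rule below (bid times relevance) and all the numbers are hypothetical; as the study notes, real platforms’ relevance inputs and ranking formulas are proprietary. The point is only that the highest monetary bid need not win.

```python
from dataclasses import dataclass

@dataclass
class Bid:
    advertiser: str
    amount: float      # monetary bid
    relevance: float   # platform's estimated user-ad relevance (opaque in practice)

def auction_winner(bids):
    """Illustrative ranking: score each bid as amount * relevance."""
    return max(bids, key=lambda b: b.amount * b.relevance)

# Hypothetical auction for one user's ad slot.
user_bids = [
    Bid("sustainability_ad", amount=1.00, relevance=0.9),
    Bid("aesthetics_ad",     amount=1.50, relevance=0.4),
]
winner = auction_winner(user_bids)
print(winner.advertiser)  # the lower bid wins on relevance: 0.90 > 0.60
```

Because relevance scores vary user by user, this kind of ranking is what steers each ad toward a different mix of users in the first place.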

The study’s findings have profound implications for marketers who rely on A-B testing of their online ads to inform their marketing strategies. “Because of low cost and seemingly scientific appeal, marketers use these online ad tests to develop strategies even beyond just deciding what ad to include in the next campaign. So, when platforms are not clear that these experiments are not truly randomized, it gives marketers a false sense of security about their data-driven decisions,” says Schwartz.

A Fundamental Problem with Online Advertising

The researchers argue that this issue is not just a technical flaw in any one testing tool, but a fundamental characteristic of how the online advertising business operates. The platforms’ primary goal is to maximize ad performance, not to provide experimental results for marketers. Therefore, platforms have little incentive to let advertisers untangle the effect of ad content from the effect of their proprietary targeting algorithms. Marketers are left in a difficult position: they must either accept the confounded results from these tests or invest in more complex and costly methods to truly understand the impact of creative elements in their ads.

The study makes its case using simulation, statistical analysis, and a demonstration of divergent delivery from an actual A-B test run in the field. It challenges the common belief that results from A-B tests that compare multiple ads provide the same ability to draw causal conclusions as do randomized experiments. Marketers should be aware that the differences in effects of ads A and B that are reported by these platforms may not fully capture the true impact of their ads. By recognizing these limitations, marketers can make more informed decisions and avoid the pitfalls of misinterpreting data from these tests.

Full article and author contact information available at: https://doi.org/10.1177/00222429241275886

About the Journal of Marketing 

The Journal of Marketing develops and disseminates knowledge about real-world marketing questions useful to scholars, educators, managers, policy makers, consumers, and other societal stakeholders around the world. Published by the American Marketing Association since its founding in 1936, JM has played a significant role in shaping the content and boundaries of the marketing discipline. Shrihari (Hari) Sridhar (Joe Foster ’56 Chair in Business Leadership, Professor of Marketing at Mays Business School, Texas A&M University) serves as the current Editor in Chief. https://www.ama.org/jm

About the American Marketing Association (AMA)

As the leading global professional marketing association, the AMA is the essential community for marketers. From students and practitioners to executives and academics, we aim to elevate the profession, deepen knowledge, and make a lasting impact. The AMA is home to five premier scholarly journals: the Journal of Marketing, Journal of Marketing Research, Journal of Public Policy & Marketing, Journal of International Marketing, and Journal of Interactive Marketing. Our industry-leading training events and conferences define future-forward practices, while our professional development and PCM® professional certification advance knowledge. With 70 chapters and a presence on 350 college campuses across North America, the AMA fosters a vibrant community of marketers. The association’s philanthropic arm, the AMA’s Foundation, is inspiring a more diverse industry and ensuring marketing research impacts public good.

AMA views marketing as the activity, set of institutions, and processes for creating, communicating, delivering, and exchanging offerings that have value for customers, clients, partners, and society at large. You can learn more about AMA’s learning programs and certifications, conferences and events, and scholarly journals at AMA.org.
