Holdout Testing

Attribution & Measurement·4 min read·May 2026

Holdout testing is a causal measurement technique that withholds advertising from a defined group — a geographic market, a user segment, or a time period — and compares conversion outcomes against a matched group that received normal advertising. The difference in conversion rates is the incremental lift caused by the advertising.

Definition

Holdout testing is the empirical foundation of incrementality measurement. Unlike attribution models that distribute credit based on touchpoints, holdout tests create a controlled experiment: one group is exposed to the advertising under test, another is not, and the difference in outcomes between the two groups is the causal impact of the advertising. No tracking, modeling, or attribution is required to interpret the result.

The two most common holdout designs are geo holdouts (testing in some geographic markets, pausing in others) and user-level holdouts (using platform tools like Meta Conversion Lift to randomly exclude a percentage of users from seeing ads). Geo holdouts are more independent — they use your own analytics to measure outcomes rather than the platform's. User-level holdouts are more precise but rely on the platform to identify and exclude the holdout group.

Exactius deploys holdout testing as a core practice in the Data layer of the Capital Allocation Loop — running at least one channel-level holdout test per quarter to keep incremental lift estimates current as audiences, creative, and market conditions change.

Why It Matters

Holdout testing is the only measurement method that can produce a causal answer to the question: would this customer have converted without the ad? Every attribution model answers a different question — which touchpoint gets credit — but credit assignment is not the same as causal impact. Holdout testing directly measures the counterfactual.

The typical finding from a first holdout test surprises most brands: 20–50% of conversions attributed to a channel are non-incremental — those customers would have converted anyway. For retargeting campaigns specifically, non-incrementality rates above 50% are common, because retargeting audiences are pre-selected for high purchase intent.

For LTV:CAC measurement, holdout tests are essential for validating that the CAC being measured is real acquisition cost and not partially recycled demand. A CAC built on non-incremental conversions has an inflated denominator: the business counts more acquired customers than the advertising actually produced, so the reported CAC understates the true cost of acquiring an incremental customer.
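To make the denominator effect concrete, here is a minimal sketch that compares attributed CAC with an incrementality-adjusted CAC. The function names and the example figures (spend of 50,000, 1,000 attributed customers, a 40% non-incremental share) are hypothetical and chosen only for illustration. The 40% share sits inside the 20–50% range quoted above.

```python
def attributed_cac(spend: float, attributed_customers: float) -> float:
    """CAC as usually reported: spend divided by all attributed customers."""
    if attributed_customers <= 0:
        raise ValueError("attributed_customers must be positive")
    return spend / attributed_customers


def incremental_cac(spend: float, attributed_customers: float,
                    non_incremental_share: float) -> float:
    """CAC after removing customers who would have converted anyway.

    non_incremental_share is the fraction of attributed conversions that a
    holdout test showed to be non-incremental (0.0 up to, but excluding, 1.0).
    """
    if not 0.0 <= non_incremental_share < 1.0:
        raise ValueError("non_incremental_share must be in [0, 1)")
    incremental_customers = attributed_customers * (1.0 - non_incremental_share)
    return attributed_cac(spend, incremental_customers)


# Hypothetical example values, not taken from any real campaign.
reported = attributed_cac(50_000, 1_000)
adjusted = incremental_cac(50_000, 1_000, 0.40)
print(f"attributed CAC: {reported:.2f}, incremental CAC: {adjusted:.2f}")
```

With these assumed inputs the adjusted CAC is noticeably higher than the attributed one, which is the gap the holdout test is meant to expose.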

How to Measure

Designing a geo holdout test

1. Select test and control markets with similar baseline conversion rates. Match on demographics, historical conversion volume, and seasonality patterns.
2. Pause or suppress advertising in control markets for the test period (typically 2–4 weeks).
3. Compare conversion rates between test and control markets during the test period, adjusting for any baseline differences.
4. Calculate incremental lift: (test market conversion rate − control market conversion rate) ÷ control market conversion rate.
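The lift formula in step 4 can be sketched in a few lines of Python. The function names and the sample counts are hypothetical. This sketch also skips the baseline adjustment mentioned in step 3 and assumes the two market groups were already well matched.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Conversions divided by visitors for one group of markets."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def incremental_lift(test_rate: float, control_rate: float) -> float:
    """Relative lift: (test rate - control rate) / control rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (test_rate - control_rate) / control_rate


# Hypothetical counts: test markets kept advertising, control markets paused it.
test_rate = conversion_rate(conversions=1_320, visitors=40_000)
control_rate = conversion_rate(conversions=1_000, visitors=40_000)
lift = incremental_lift(test_rate, control_rate)
print(f"test {test_rate:.4f}, control {control_rate:.4f}, lift {lift:.1%}")
```

A positive result means the advertised markets converted at a higher rate than the paused markets. A value near zero suggests the advertising added little beyond baseline demand.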

Statistical requirements

As a rule of thumb, each cell (test group and control group) needs at least 100 conversions per week to reach statistical significance at 90% confidence; the exact volume required depends on the baseline conversion rate and the size of the lift being detected. Test periods shorter than 2 weeks are usually underpowered. Avoid running holdout tests during promotional events, major holidays, or periods of unusual external activity.
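One common way to check whether an observed difference clears the 90% confidence bar is a two-proportion z-test. The sketch below is an assumption about how such a check could look, not the method prescribed here. It treats every visitor as an independent observation, which is a simplification for geo tests where whole markets are the unit of assignment. The function name and sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_test: int, n_test: int,
                          conv_control: int, n_control: int) -> tuple[float, float]:
    """Return (z statistic, two-sided p-value) for a difference in conversion rates."""
    if n_test <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    p_test = conv_test / n_test
    p_control = conv_control / n_control
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_test + conv_control) / (n_test + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    if se == 0:
        raise ValueError("standard error is zero; rates are degenerate")
    z = (p_test - p_control) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical two-week totals for each cell.
z, p_value = two_proportion_z_test(conv_test=260, n_test=10_000,
                                   conv_control=200, n_control=10_000)
ALPHA = 0.10  # 90% confidence corresponds to a 10% significance level
print(f"z = {z:.2f}, p = {p_value:.4f}, significant at 90%: {p_value < ALPHA}")
```

Running the same check on planned volumes before launch shows whether the test is likely to be underpowered for the lift you expect.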

The Exactius Take

Holdout testing is non-negotiable in the Exactius methodology. Before any channel is scaled significantly, Exactius runs a holdout to confirm the channel is generating incremental demand at the margin. Scaling spend into a channel with 60% non-incrementality is not growth — it is expensive credit reassignment.

The most common holdout finding Exactius encounters is that retargeting campaigns have very high non-incrementality — often 50–70%. This does not mean retargeting should be paused; it means retargeting spend should be calibrated to the incremental conversion rate, not the total conversion rate. The budget difference is typically substantial.
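One possible way to do that calibration, sketched under stated assumptions: scale the platform-reported CPA target by the incremental share measured in the holdout. No specific formula is prescribed here, so the function names and the example figures (a 60.00 incremental CPA target, 60% non-incrementality) are hypothetical.

```python
def incremental_cpa(attributed_cpa: float, non_incremental_share: float) -> float:
    """Cost per incremental conversion implied by a platform-reported CPA."""
    if not 0.0 <= non_incremental_share < 1.0:
        raise ValueError("non_incremental_share must be in [0, 1)")
    return attributed_cpa / (1.0 - non_incremental_share)


def max_attributed_cpa(target_incremental_cpa: float,
                       non_incremental_share: float) -> float:
    """Highest platform-reported CPA that still meets the incremental CPA target."""
    if not 0.0 <= non_incremental_share < 1.0:
        raise ValueError("non_incremental_share must be in [0, 1)")
    return target_incremental_cpa * (1.0 - non_incremental_share)


# Hypothetical retargeting campaign with 60% non-incremental conversions.
cap = max_attributed_cpa(target_incremental_cpa=60.0, non_incremental_share=0.60)
print(f"bid to an attributed CPA of at most {cap:.2f}")
print(f"which implies an incremental CPA of {incremental_cpa(cap, 0.60):.2f}")
```

The two functions are inverses of each other, so a campaign held to the capped attributed CPA lands on the incremental target rather than overshooting it.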

David Manela's framework treats holdout tests as the empirical anchor of the Capital Allocation Loop. Without holdout data, the loop is making decisions based on attributed performance rather than causal performance — and those two numbers are rarely the same.

Exactius embeds growth squads that run a holdout testing calendar — quarterly per major channel, semi-annually at the portfolio level — so the Capital Allocation Loop always has current incremental lift data to work from.

FAQ

What is a holdout test in marketing?

A holdout test is a controlled experiment that measures the causal impact of advertising by comparing conversion rates between a group that received the advertising and a group that did not. The group that did not receive advertising is the holdout. Because the holdout group is matched or randomized to be comparable to the exposed group, the difference in conversion rate between the two groups can be attributed to the advertising rather than to pre-existing intent or market conditions. Holdout tests are the most reliable method for measuring true incremental impact, and they are the primary causal measurement tool in the Exactius Capital Allocation Loop.

What is the difference between a holdout test and an A/B test?

An A/B test compares two versions of something (two ad creatives, two landing pages, two audience targets) to determine which performs better. A holdout test compares the presence of advertising against its absence to determine whether the advertising is causing any lift at all. A/B tests optimize within a channel; holdout tests validate whether the channel itself is incremental. Both are necessary. A/B testing without holdout testing optimizes channels that may not be generating incremental value. Holdout testing without A/B testing validates channel impact but does not find the most efficient execution within the channel.

How long should a holdout test run?

A holdout test should run for a minimum of 2 weeks, and ideally 4 weeks, to produce statistically reliable results. Tests shorter than 2 weeks are usually underpowered and susceptible to day-of-week variance. Each cell — both the exposed and the holdout group — needs at least 100 conversions per week to reach 90% statistical confidence. For brands with lower conversion volumes, a geo holdout running for 4 weeks across matched markets with combined volume above 100 conversions per week per group is the minimum viable design. Avoid test periods that overlap with major promotions, seasonal peaks, or external market events that could distort the baseline.
