The Most Transparent Incrementality Testing Tool: Stella

What incrementality testing costs in 2026, how each method works, and how to tell a real measurement vendor from a confident dashboard.

Jun 8, 2026
Updated June 2026

Quick answer: The right incrementality testing tool depends on your spend, your data, and how much proof you need. Platform lift tests give a fast directional read inside one ad platform. Media mix modeling (MMM) fits always-on budget planning across channels. Geo holdout testing is strongest when you need to prove a channel, campaign, or budget change actually caused revenue. For most direct-to-consumer (DTC) brands, a geo test runs about 30 to 35 days and needs $15,000 to $30,000 in media to produce a measurable signal. The cheapest option is rarely the safest. Pick the method whose answer you can defend.

Most posts about incrementality tools start with price. That is the wrong place to start.

The better question is whether the number a tool gives you would survive an audit. Can you see the test design? Can you inspect the pre-period fit? Can you see the confidence interval? Can you explain the result to finance without asking them to trust a dashboard?

Below we break down how these tools work, what they cost, which method fits which decision, how to vet a vendor, and when you do not need a full incrementality platform yet.

What is an incrementality testing tool?

An incrementality testing tool measures the revenue your advertising caused above what would have happened anyway. Platform ROAS tells you which ads received credit for revenue. Incrementality tells you whether the ads caused it. Those are not the same thing.

A customer might click a branded search ad after already deciding to buy. A shopper might see a retargeting ad after adding an item to cart. The platform takes credit for both because the ad appeared near the purchase. But proximity is not causation.

Incrementality testing compares what happened with advertising to what likely would have happened without it. Depending on the method, that comparison runs through a geo holdout, a conversion lift test, a media mix model, or another causal approach. The goal is always the same: separate the revenue your ads created from the revenue your reporting simply attributed. If you want the mechanics of the calculation, see our incremental revenue formula and benchmarks.
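Once a method has produced the counterfactual, the rest is arithmetic. Here is a minimal Python sketch of that last step. It assumes you already have an estimate of expected revenue without the ads, which is the hard part and is not what this snippet does. The function name and every dollar figure are hypothetical.

```python
def incremental_roas(actual_revenue, expected_revenue, test_spend):
    """Return (incremental_revenue, iROAS) for a single test.

    expected_revenue is the counterfactual: what the test regions would
    likely have earned without the advertising being tested.
    """
    if test_spend <= 0:
        raise ValueError("test_spend must be positive")
    incremental = actual_revenue - expected_revenue
    return incremental, incremental / test_spend


# Hypothetical figures, for illustration only.
incremental, iroas = incremental_roas(
    actual_revenue=150_000.0,
    expected_revenue=120_000.0,
    test_spend=20_000.0,
)
print(f"incremental revenue: ${incremental:,.0f}")
print(f"iROAS: {iroas:.2f}x")
```

A negative result is possible and meaningful: it says revenue in the test regions came in below the counterfactual.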

Across Stella's published benchmark of 225 geo-based incrementality tests run between August 2024 and December 2025, the median campaign returned $2.31 in incremental revenue for every dollar spent, with the middle 50% of tests landing between 1.36x and 3.24x. That dataset is Stella customers, mostly DTC ecommerce brands, so treat it as a benchmark for tested brands, not a universal industry average. Even so, it makes the point: the revenue an ad caused is often very different from the revenue it got credit for.

Why is platform attribution not enough?

Platform attribution is useful for daily operations. It helps teams pace spend, compare campaigns, and see what the platforms are optimizing toward. It is not enough for budget truth.

Last-click gives full credit to the final ad. View-through credits an ad someone merely saw. Multi-touch spreads credit across touchpoints. All three help with reporting. None of them prove the ad caused the sale.

Reliable attribution has also gotten harder as tracking signals weakened. Apple's App Tracking Transparency makes apps ask permission before tracking users across apps and sites. Safari and Firefox have long limited third-party cookies. Chrome did not ultimately remove them: Google kept its existing cookie controls instead of rolling out a new standalone prompt, and later retired or deprecated many Privacy Sandbox technologies. The direction of the market is still toward less complete user-level tracking.

But the deeper issue is not signal loss. It is logic. Attribution is observational. It sees an ad and a sale close together and assigns credit. Incrementality is causal. It asks what changed when the ad was withheld, reduced, or isolated.

This is not just a marketing opinion. In a Marketing Science study of 15 large field experiments at Facebook, the observational methods behind platform reporting overstated advertising's effect, in half the studies by a factor of three. That is the exact problem incrementality testing exists to solve.

Why do branded search and retargeting often look better than they are?

Branded search and retargeting are the clearest places to see the gap between attributed and incremental revenue. Branded search shows high ROAS because people searching your name already have intent. Retargeting looks efficient because it reaches people who just visited your site.

Those channels can still have value. Branded search can protect against competitors bidding on your name. Retargeting can recover some shoppers who would have left. The point is not that they are worthless. The point is that platform reporting cannot tell you how much of that revenue the ad actually caused.

In Stella's benchmark, Google Search Branded had a median incremental ROAS of 0.70x, meaning the median branded search test returned less than a dollar of new revenue per dollar spent. The same benchmark found platform-reported ROAS often overstated true incrementality by 2x to 3x overall, and by 5x to 10x in some branded search and retargeting cases.

Here is what that looks like in practice. A brand sees 9x reported ROAS on branded search and keeps adding budget. A geo holdout shows the channel is closer to 0.70x incremental. The right move is not automatically to turn off Google. It is to cap branded search, protect the exact terms that matter, and move the freed-up budget into channels that create demand instead of capturing it.

What is your reported ROAS actually worth?

Applying the ranges from Stella's 225-test benchmark to a platform-reported ROAS of 9.0x gives a benchmark-adjusted estimate of 0.9x to 1.8x likely incremental, which implies 80% to 90% likely attribution inflation. This is an estimate only. Your actual number requires a geo holdout test.
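The arithmetic behind an estimate like this is simple enough to sketch. The Python below assumes the 5x to 10x overstatement range quoted above for some branded search and retargeting cases. How Stella's own estimator maps reported ROAS to a range is not stated here, so treat the function and its default factors as hypothetical.

```python
def benchmark_adjusted_range(reported_roas, low_factor=5.0, high_factor=10.0):
    """Divide reported ROAS by an assumed overstatement range.

    Returns two (low, high) tuples: the likely incremental ROAS and the
    implied share of reported ROAS that is attribution inflation.
    The default factors are an assumption, not a measured result.
    """
    if reported_roas <= 0:
        raise ValueError("reported_roas must be positive")
    if not 0 < low_factor <= high_factor:
        raise ValueError("factors must satisfy 0 < low_factor <= high_factor")
    # A larger overstatement factor means a smaller incremental ROAS.
    likely_iroas = (reported_roas / high_factor, reported_roas / low_factor)
    inflation = (1 - 1 / low_factor, 1 - 1 / high_factor)
    return likely_iroas, inflation


(iroas_low, iroas_high), (infl_low, infl_high) = benchmark_adjusted_range(9.0)
print(f"likely incremental: {iroas_low:.1f}x to {iroas_high:.1f}x")
print(f"likely inflation: {infl_low:.0%} to {infl_high:.0%}")
```

With a reported ROAS of 9.0x, those assumed factors reproduce the 0.9x to 1.8x and 80% to 90% figures above. A different channel would call for different factors, which is exactly why the estimate is not a substitute for a test.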

How much does incrementality testing cost?

For most DTC brands, a geo-based test needs roughly $15,000 to $30,000 in media to produce a measurable signal. That is not a software fee. It is the media budget needed for the test to detect a real difference between treatment and control regions.

The exact number depends on your baseline revenue, conversion volume, geographic spread, channel mix, and how noisy your data is. Across Stella's 225-test benchmark, the median test budget was $23,825, with a full range from $7,007 to $102,933, and the most reliable zone for many brands sat between $15,000 and $30,000.

Below that range, the constraint is usually not dollars. It is statistical power. If there are not enough conversions during the test window, the result disappears into noise. Above it, you are often buying precision without buying more decision value.
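To make the power constraint concrete, here is a rough Python sketch of a minimum detectable lift calculation. It assumes a deliberately simplified design: the test is read from daily treatment-minus-control revenue differences that are independent and roughly normal. Real geo tests use richer models, so this is a back-of-envelope check, not any vendor's method. All inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_lift(baseline_daily_revenue, daily_std, days,
                            alpha=0.10, power=0.80):
    """Smallest relative lift a two-sided z-test could reliably detect.

    daily_std is the standard deviation of the daily treatment-minus-control
    revenue difference. Independence across days is assumed.
    """
    if baseline_daily_revenue <= 0 or daily_std < 0 or days < 1:
        raise ValueError("inputs must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = NormalDist().inv_cdf(power)
    standard_error = daily_std / sqrt(days)
    return (z_alpha + z_power) * standard_error / baseline_daily_revenue


# Hypothetical brand: $10,000 baseline daily revenue in test regions,
# $4,000 daily noise, read at 90% confidence.
for days in (14, 33, 60):
    mde = minimum_detectable_lift(10_000.0, 4_000.0, days)
    print(f"{days} days: minimum detectable lift is about {mde:.1%}")
```

The pattern to notice is that noise and test length set the floor. If the lift your spend can plausibly create is below that floor, the test will not find it no matter how the dashboard looks.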

So when someone asks for the cheapest incrementality testing tool, cheap is the wrong filter. The cheapest test is not the one with the lowest fee. It is the one that gives you a result you can act on without apologizing for the methodology.

Which incrementality method should you use?

There is no single best method. There is the method that matches the decision you are trying to make.

Which method fits the decision you are making
Situation | Best-fit method
You need daily campaign pacing | Attribution dashboard
You need a fast read inside one ad platform | Platform lift test
You need always-on channel planning | MMM
You need to prove whether spend caused revenue | Geo holdout test
You need to defend a budget shift to finance | Geo holdout test with auditable outputs
You have low spend or low conversion volume | Platform lift tests, surveys, and cleaner structure first

The mistake is treating every tool like it answers the same question. Attribution answers "who got credit." MMM answers "how have channels contributed over time." Platform lift answers "what did this platform's own experiment detect." Geo holdout testing answers "what changed when this spend was isolated." A good measurement program knows which question it is asking. If you want the mechanics of running a geo test inside Google, our Meridian GeoX setup guide walks through it step by step.

Are platform lift tests enough?

Sometimes. A platform conversion lift test is useful when you need a quick read on one channel and do not have the budget or data maturity for a broader experiment. It is easier to launch than a geo test and gives directional guidance.

But the platform controls the experiment, the delivery system, and the reporting. That does not make the result useless, but it does mean the test is not fully independent. A platform lift test is usually fine when the decision is tactical. A region-level holdout you control is better when the decision is financial: shifting real budget, cutting a channel, or defending spend to a CFO.

How long does an incrementality test take?

Most geo-based tests run about 30 to 35 days. That is long enough to capture a purchase cycle and detect a real difference, and short enough to avoid dragging into the next quarter. Tests much shorter than that can end before the effect is measurable.

In Stella's benchmark, the median test duration was 33 days, and 88.4% of tests reached statistical significance at 90% confidence. That is not a promise. Significance depends on spend, signal, geography, baseline volume, and pre-period fit, so treat it as what is achievable with good test design.
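For readers who want to see what "significant at 90% confidence" means mechanically, here is a minimal Python sketch. It assumes the same simplified setup as a basic paired comparison: a list of daily treatment-minus-control revenue differences, treated as independent, with a normal approximation. With only a few weeks of data a t-distribution would give a slightly wider interval. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def lift_confidence_interval(daily_differences, confidence=0.90):
    """Normal-approximation interval for the mean daily revenue lift."""
    n = len(daily_differences)
    if n < 2:
        raise ValueError("need at least two daily observations")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    center = mean(daily_differences)
    standard_error = stdev(daily_differences) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * standard_error, center + z * standard_error


def excludes_zero(interval):
    """True when the whole interval sits on one side of zero."""
    low, high = interval
    return low > 0 or high < 0


# Hypothetical daily lift in dollars (treatment minus control).
differences = [400.0, 650.0, 300.0, 800.0, 550.0, 500.0, 700.0, 450.0]
interval = lift_confidence_interval(differences)
print(f"90% interval: ${interval[0]:,.0f} to ${interval[1]:,.0f}")
print("significant at 90%:", excludes_zero(interval))
```

The interval is the part worth asking for. A test can be significant and still have a range wide enough to change the budget decision.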

How do you vet an incrementality vendor?

Ask one question first: can you verify the number?

A trustworthy tool shows more than a final iROAS. It shows how the number was produced and how much uncertainty surrounds it. Any model can be tuned to fit the past, so the harder question is whether it performs on data it did not see during training. That is why out-of-sample accuracy matters more than a beautiful in-sample fit.
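The in-sample versus out-of-sample distinction is easy to check yourself if a vendor exports predictions. A minimal Python sketch using MAPE (mean absolute percentage error) follows. The revenue figures are hypothetical, and the split between training weeks and held-out weeks is assumed to have been made before the model was fit.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 = 5%)."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical weekly revenue. The model saw the training weeks while fitting.
train_actual = [100.0, 200.0, 400.0]
train_predicted = [98.0, 204.0, 396.0]

# The model never saw these weeks.
holdout_actual = [100.0, 200.0]
holdout_predicted = [110.0, 170.0]

print(f"in-sample MAPE: {mape(train_actual, train_predicted):.1%}")
print(f"out-of-sample MAPE: {mape(holdout_actual, holdout_predicted):.1%}")
```

A large gap between the two numbers is the warning sign: the model has learned the past rather than the pattern.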

What to ask a vendor, and what a weak answer looks like
What to ask | Red flag | Stronger answer
Model fit | Only shows in-sample R-squared | Shows out-of-sample performance
Accuracy | "Trust the model" | Shows MAPE or another inspectable error metric
Uncertainty | Gives one exact number | Shows confidence intervals
Validation | No placebo tests or backtests | Shows placebo results or holdout validation
Transparency | Dashboard only | Exportable outputs and methodology notes
Failure modes | Says the model works for everyone | Explains when a test will not be valid
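The validation row is worth a closer look, because placebo results are simple to sanity-check. One common form of placebo test runs the same analysis on periods where nothing changed and counts how often it still "finds" lift. The Python sketch below assumes the vendor can export a confidence interval per placebo run. The intervals shown are hypothetical.

```python
def placebo_false_positive_rate(placebo_intervals):
    """Share of placebo intervals (low, high) that exclude zero.

    Each interval comes from a window with no real change in spend,
    so any interval that excludes zero is a false positive.
    """
    if not placebo_intervals:
        raise ValueError("need at least one placebo interval")
    flagged = sum(1 for low, high in placebo_intervals if low > 0 or high < 0)
    return flagged / len(placebo_intervals)


# Hypothetical 90% intervals from ten placebo windows, in dollars of lift.
placebo_intervals = [
    (-900.0, 1100.0), (-400.0, 1500.0), (-1300.0, 600.0), (-700.0, 900.0),
    (150.0, 2100.0), (-1000.0, 800.0), (-600.0, 1200.0), (-1500.0, 300.0),
    (-800.0, 1000.0), (-500.0, 1400.0),
]
rate = placebo_false_positive_rate(placebo_intervals)
print(f"placebo false positive rate: {rate:.0%}")
```

At 90% confidence, a well-calibrated method should flag roughly one placebo window in ten. A rate far above that suggests the intervals are too narrow to trust.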

A real measurement partner will tell you when not to run a test. That is often the fastest way to tell whether someone is selling measurement or just software.

When do you not need incrementality testing?

You probably do not need a full incrementality platform if your spend or conversion volume is too low to produce a clean signal. As a rough rule, brands spending under about $50,000 a month total should usually start elsewhere, because a geo test may not have the conversion volume to detect a reliable difference between test and control regions.

That does not mean ignoring measurement. It means starting with the basics: cleaner campaign naming, post-purchase surveys, native platform lift tests, and better separation between branded, retargeting, and prospecting spend. Those improve decisions before a full geo test makes sense.

Incrementality testing earns its cost when three things are true. You have enough spend and conversion volume to detect signal. Your channel mix is complex enough that platform attribution is misleading. And the budget decision is important enough that finance needs a defensible answer. When those hold, the cost of not testing is usually higher than the cost of the test.

Why choose Stella?

Do not choose a measurement vendor on price. Choose the one that shows its work.

Stella is built for brands that need to know what actually caused growth, not just what got credit for it. Our team designs the test, runs the analysis, and walks you through the outputs behind the result.

A Stella result does not just say "Meta drove 2.4x iROAS." It shows the test and control regions, the pre-period fit, the expected revenue without the ads, the actual revenue after the test, the confidence interval, and the error rate. That is the difference between a number that looks good on a dashboard and a number your CFO can push on without it falling apart.

There are two ways to work with us. If you have the data and the team to run it, Stella self-serve lets you bring your numbers and run rigorous analysis yourself. If you do not, we run it with you, more like an enterprise data science engagement, without forcing you to build that team in-house. Either way the bar is the same: transparent test design, inspectable outputs, and a result you can defend.

The benchmark figures in this article come from our published study of 225 tests. For customers, the test outputs themselves are auditable: the holdout design, confidence intervals, error rates, and raw results. A result you cannot explain internally will not survive contact with finance.

Frequently asked questions

What is the difference between incrementality testing and attribution?
Attribution assigns credit for a sale based on which ads a customer touched. Incrementality testing measures whether the ad caused the sale at all, by comparing what happened with advertising to what likely would have happened without it. Attribution is about credit. Incrementality is about causation.

How much ad spend do you need for an incrementality test?
For most DTC brands, a geo-based test needs roughly $15,000 to $30,000 in media to produce a measurable signal. The real constraint is conversion volume, not just dollars. If there are not enough conversions during the test window, the result is too noisy to trust.

How long does an incrementality test take?
Most geo-based tests run about 30 to 35 days. That gives the test enough time to observe a purchase cycle and detect lift without slowing decisions for a full quarter.

What is the cheapest incrementality testing tool?
Cheap is the wrong filter. A low-cost test that produces a number you cannot defend is not a bargain. Compare tools on test design, confidence intervals, validation, out-of-sample accuracy, and whether you can inspect the outputs.

Is incrementality testing worth it for small brands?
Usually not until the brand has enough spend and conversion volume to produce a reliable signal. Smaller brands should start with platform lift tests, post-purchase surveys, cleaner campaign structure, and better separation between branded, retargeting, and prospecting spend.

What is the difference between MMM and incrementality testing?
MMM estimates how channels contribute to revenue over time. Incrementality testing isolates whether a specific channel, campaign, or budget change caused additional revenue. Mature programs use both: MMM for always-on planning, geo tests for high-stakes proof.

What type of incrementality tool should I use?
Use attribution for daily pacing, platform lift tests for fast directional reads, MMM for always-on budget planning, and geo holdout testing when you need to prove whether spend caused revenue. The best tool depends on the decision you are trying to make.

Can you run incrementality testing without cookies?
Yes. Geo holdout tests and MMM do not depend on user-level tracking the way pixel-based attribution does. That is one reason causal measurement has become more important as tracking signals have weakened.

Can incrementality testing measure branded search?
Yes, and it is one of the most important channels to test, because it often gets credit for demand that already existed. A test shows how much branded search protects revenue, how much captures demand you already own, and how much budget can be safely reallocated.

What should I ask an incrementality vendor before buying?
Ask for the test design, pre-period fit, out-of-sample accuracy, confidence intervals, placebo or backtest results, error rates, and raw outputs you can inspect. Then ask when the model fails. A vendor that cannot explain its failure modes is asking you to trust the dashboard.

Want to know which campaigns are actually causing growth?

Stella will analyze your spend, design the right test, and show the gap between platform-reported ROAS and real incremental revenue. No black box. No vendor math you have to take on faith.