Last updated: September 2025
Every marketer has lived this moment. Facebook reports a 4x ROAS. Google Analytics shows conversions climbing. Your dashboard is green. Success feels certain.
Then the uncomfortable realization creeps in. Many of those customers were already planning to buy. They bookmarked the site last week. They searched your brand name yesterday. The ad captured credit, but it did not create the sale. This is the central flaw in attribution. It measures correlation, not causation. It is very good at counting what happened and very poor at estimating what would have happened anyway.
Industry observers estimate that a very large share of ad budgets goes to conversions that would have occurred without the spend. The antidote is incrementality testing. It is not a better way to attribute. It is a different question entirely. It asks what your marketing truly caused.
Key idea to hold on to: Attribution tells you what happened. Incrementality tells you what you caused to happen.
An incrementality testing platform is software that answers the only question that matters for profitable growth:
"Would this sale have happened without the ad?"
Rather than crediting clicks or views, the platform runs controlled experiments. It compares markets that receive exposure to markets that do not, then calculates the additional revenue and orders caused by the campaign. This is the incremental lift. It is the cleanest way to separate signal from noise.
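For intuition, here is a minimal sketch of that comparison in Python, assuming a daily region-level dataset with hypothetical period, group, revenue, and spend columns. The simple scaling step stands in for the more careful counterfactual models a real platform would use.

```python
import pandas as pd

# A minimal lift calculation. Column names (period, group, revenue, spend) and
# the simple scaling step are illustrative assumptions, not a platform's method.
df = pd.read_csv("daily_region_sales.csv", parse_dates=["date"])  # hypothetical file

pre = df[df["period"] == "pre"]
test = df[df["period"] == "test"]

# How test markets compared to control markets before the campaign.
scale = (pre.loc[pre["group"] == "test", "revenue"].sum()
         / pre.loc[pre["group"] == "control", "revenue"].sum())

observed = test.loc[test["group"] == "test", "revenue"].sum()
counterfactual = test.loc[test["group"] == "control", "revenue"].sum() * scale

incremental_revenue = observed - counterfactual
spend = test.loc[test["group"] == "test", "spend"].sum()
print(f"Incremental revenue: {incremental_revenue:,.0f}, iROAS: {incremental_revenue / spend:.2f}x")
```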
The goal is not to replace attribution. The goal is to ground decisions in causal truth when real money is on the line.
The concept is simple. The execution requires rigor. Here is the flow used by serious teams.
Most tests are won or lost at location selection. If your test and control regions do not naturally behave the same way before the test, no model will rescue the results afterward.
What strong location selection looks like:
A practical example:
If markets do not move together in the past, differences during the test are not credible evidence of lift. The integrity of your test begins with correlation, not with modeling.
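As an illustration, a pre-period correlation screen takes only a few lines of Python. The file name, the proposed_test_market label, and the 0.8 cutoff are assumptions for this sketch, not fixed rules.

```python
import pandas as pd

# Pre-period matching screen: candidate control markets should track the
# proposed test market before the test begins.
pre = pd.read_csv("pre_period_daily_revenue.csv", parse_dates=["date"])  # hypothetical file

# One column per region, one row per day.
series = pre.pivot(index="date", columns="region", values="revenue")

# Correlation of every other market with the proposed test market.
corr = series.corr()["proposed_test_market"].drop("proposed_test_market")
print(corr.sort_values(ascending=False).head(10))

# Rough screen: keep markets that moved closely with the test market pre-period.
candidate_controls = corr[corr > 0.8].index.tolist()
```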
Pick the design that answers your question with the least noise.
Start with inverse holdouts on your biggest channel. You will learn more in two to four weeks than you would from months of dashboard debate.
This is where platforms diverge, and the trade-offs are not trivial.
The most expensive part of incrementality is a bad test, not the software. Human control with a clear checklist often yields cleaner experiments.
Once the test and any cooldown window complete, the platform needs clean, daily, region-level data.
Core inputs:
How other platforms handle this:
Many connect directly to ad accounts and automate data pulls. This reduces manual work but is a major driver of price. Maintaining deep integrations and taking on account-side responsibility often pushes the monthly cost into the eight to twelve thousand dollar range. It also creates a failure mode. A junior buyer launches a campaign without exclusions and the test is compromised before the data even enters the system.
How Stella handles this by design:
Stella uses a clean Google Sheet or CSV upload. Data can come from Shopify, your analytics platform, or ad exports. This keeps the product affordable, preserves flexibility across any channel, and keeps operational control with your team. For most brands that run a few high quality studies per year, the return on heavy integrations is limited. For all brands, the integrity of the test is paramount.
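If you want a pre-upload sanity check, a short script can catch schema problems before they reach the platform. The helper below is illustrative, not part of Stella; the required columns follow the daily region-level inputs described in this article and in the FAQ.

```python
import pandas as pd

# Illustrative pre-upload check for a daily, region-level CSV.
REQUIRED = ["date", "region", "revenue", "orders", "spend"]

def validate_upload(path: str) -> pd.DataFrame:
    df = pd.read_csv(path, parse_dates=["date"])
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Daily cadence: every region should have exactly one row per date.
    dupes = df.duplicated(subset=["date", "region"]).sum()
    if dupes:
        raise ValueError(f"{dupes} duplicate date/region rows found")
    if df[["revenue", "orders", "spend"]].isna().any().any():
        raise ValueError("Null values in revenue, orders, or spend")
    return df.sort_values(["region", "date"])

clean = validate_upload("stella_upload.csv")  # hypothetical file name
```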
What happens after upload in Stella:
Different models can produce different answers on the same dataset. Hiding that variance with a one-model approach creates false certainty. Showing multiple candidates and selecting based on fit creates trustworthy outcomes.
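One way to picture the multi-model approach: fit several candidate counterfactual models on pre-period data and score each on held-out days. The models, the synthetic data, and the MAPE criterion below are illustrative assumptions, not Stella's exact method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(0)
# Stand-in pre-period data: 90 days of three control markets and one test market.
X_pre = rng.normal(1000, 50, size=(90, 3))                          # control-market revenue
y_pre = X_pre @ np.array([0.5, 0.3, 0.2]) + rng.normal(0, 20, 90)   # test-market revenue

split = 72  # hold out the last 18 pre-period days to score fit
candidates = {"ols": LinearRegression(), "ridge": Ridge(alpha=1.0), "lasso": Lasso(alpha=0.1)}
scores = {}
for name, model in candidates.items():
    model.fit(X_pre[:split], y_pre[:split])
    scores[name] = mean_absolute_percentage_error(y_pre[split:], model.predict(X_pre[split:]))

best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```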
A credible report is both statistical and practical.
Expect to see:
Many teams add a short post-treatment window. For a geo holdout you turn ads back on. For an inverse holdout you turn ads back off. For a scale test you return spend to baseline. This is a useful validity check. If results normalize as expected, your causal story is stronger.
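A rough way to check that normalization, assuming you have daily observed and counterfactual series labeled by period. The column names and the simple average-gap metric are assumptions for this sketch.

```python
import pandas as pd

# Cooldown validity check: the gap between observed test-market revenue and the
# counterfactual should shrink toward zero once spend reverts to normal.
df = pd.read_csv("results_with_counterfactual.csv", parse_dates=["date"])  # hypothetical file

gap = df["observed_revenue"] - df["counterfactual_revenue"]
test_gap = gap[df["period"] == "test"].mean()
cooldown_gap = gap[df["period"] == "cooldown"].mean()

print(f"Avg daily gap during test:     {test_gap:,.0f}")
print(f"Avg daily gap during cooldown: {cooldown_gap:,.0f}")
# A cooldown gap near zero while the test gap is clearly positive
# strengthens the causal story.
```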
Incrementality testing is not just a method. It is a decision engine across common marketing questions.
Each use case helps a team shift from vanity metrics to business outcomes.
Teams with technical talent often consider building their own workflow. There are open-source options such as GeoLift in R and CausalPy in Python. These are valuable tools. They are also easy to misuse.
Where DIY usually struggles:
If your team runs one complex study per year and has the in-house expertise, DIY can be a learning exercise. If your team needs repeatable decisions and guardrails, a dedicated platform is usually safer and cheaper in total cost.
The honest comparison is simple. Competitors bring strong integrations and hands-on service. Stella brings scientific rigor, multiple models, and guided decision support at a fraction of the cost.
Competitors do good work for enterprise teams that need white-glove execution. Most marketers do not need that overhead. Stella delivers accuracy and action at a price that makes repeat testing feasible.
Your first test may show lower iROAS than attribution suggests. That is not a failure. That is informative truth.
Attribution is useful for operations and daily reporting. It is not designed to answer the causal question that drives profitable growth. Incrementality testing does exactly that. It does not make your marketing look better. It makes your decisions better.
Measured and Haus helped the market recognize that causation matters. Stella advances the practice with rigorous location selection, multiple modeling approaches, AI guidance that translates iROAS into budget moves, and a cost structure that allows teams to test more often.
If you want to stop arguing with dashboards and start reallocating with confidence, begin with one clean inverse holdout on your largest channel. The clarity you gain will change how you plan budgets for the rest of the year.
Run your first study free today. No credit card required.
Try the free virtual demo of Stella right here, right now.
How much does incrementality testing cost
Platform pricing ranges from about one thousand to three thousand dollars per month for self-service tools and up to twenty-five thousand dollars per month for enterprise solutions with deep integrations. The larger cost is the test budget itself. Plan to allocate ten to twenty percent of a channel’s spend during the test window.
How long should a test run
Plan for at least three weeks. Four to six weeks is common when cycles are longer or effects are subtle. Scale tests that estimate response curves often benefit from six to eight weeks.
Can I run multiple tests at the same time
Generally no. Overlapping tests on interacting channels create interference that clouds the read. Test one major channel at a time and allow a short washout between studies.
What if my incrementality results contradict attribution
Expect this. Attribution shows correlation while incrementality estimates causal lift. When they disagree, the gap is often organic demand that attribution counted as paid impact.
Do I need a data scientist to run these tests
A basic understanding of fit metrics and confidence intervals helps. That said, modern platforms provide strong guardrails. Stella in particular runs multiple models and flags the most trustworthy result with clear explanations.
What data do I need to upload into Stella
Date, region, revenue, orders, and the tested channel’s spend at a daily cadence. Optional columns for confounders such as other channel spend, promotions, or out-of-stock events improve accuracy. A clean Google Sheet or CSV is all you need.
Why does Stella use manual uploads instead of integrations
Two reasons. Cost and control. Heavy integrations raise price and introduce account-side liability. Manual upload keeps Stella affordable and keeps your team in control. For most brands that run a few high quality studies per year, that is the most efficient trade-off.
What is a good confidence interval
Tighter is better. If you see iROAS of 3.2x with a range from 2.8x to 3.6x, you can act with confidence. If the range is 0.8x to 5.6x, extend the test or improve market matching before moving budget.
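One way to encode that reading as a rule of thumb is sketched below. The breakeven level and the width cutoff are illustrative assumptions, not universal thresholds.

```python
# Illustrative decision rule for reading an iROAS confidence interval.
def read_result(iroas: float, low: float, high: float, breakeven: float = 1.0) -> str:
    if low > breakeven:
        return "Confidently incremental: consider scaling."
    if high < breakeven:
        return "Confidently below breakeven: consider cutting."
    if (high - low) > 2.0:
        return "Interval too wide: extend the test or improve market matching."
    return "Inconclusive near breakeven: hold budget and retest."

print(read_result(3.2, 2.8, 3.6))   # tight interval -> act
print(read_result(3.2, 0.8, 5.6))   # wide interval  -> extend the test
```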
How do I handle seasonality
Avoid tests that straddle major holidays or promotions. Use longer pre-periods to capture recurring seasonal patterns. Models like BSTS can incorporate seasonal structure, but clean scheduling is still your first line of defense.
What is the minimum spend that makes sense
As a rule of thumb, monthly channel spend above fifty thousand dollars produces cleaner reads. Below that threshold the test can still work, but you may need a longer window to reach significance.
What is the simplest way to get started
Pick your largest channel. Run an inverse holdout for three to four weeks. Use Stella to select markets, validate fit, and upload the data. Review iROAS with its confidence interval. Translate the result into expected profit impact. Plan one follow-up test based on what you learned.
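Here is a quick worked example of the profit-translation step. Every number below is an illustrative assumption, not a benchmark.

```python
# Turning a measured iROAS into an expected profit read before moving budget.
iroas = 1.8                 # incremental revenue per dollar of tested spend (assumed)
gross_margin = 0.55         # contribution margin on incremental revenue (assumed)
planned_spend = 40_000      # monthly budget being considered for the channel (assumed)

incremental_revenue = iroas * planned_spend
incremental_profit = incremental_revenue * gross_margin - planned_spend
print(f"Expected incremental revenue: ${incremental_revenue:,.0f}")
print(f"Expected incremental profit:  ${incremental_profit:,.0f}")
# Breakeven iROAS at this margin is 1 / gross_margin, roughly 1.8x here,
# so this read is close to breakeven and would not justify scaling on its own.
```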