Google Ads Incrementality Test with Meridian GeoX: Setup Guide
The 9-step runbook for running a Google Ads geo incrementality test with Meridian GeoX. Setup, market selection, iROAS, and MMM calibration.
May 27, 2026
To run a Google Ads incrementality test with Meridian GeoX, pick one campaign, choose a design (holdback, go dark, or heavy up), split your geographies into treatment and control markets, align Google Ads spend with backend revenue by region, run the test long enough to detect real lift, calculate iROAS, then feed the result into Meridian MMM as a Bayesian prior. GeoX is not available to everyone yet. Run the framework now and port it into GeoX when it ships.
Executive Summary
Google previewed Meridian GeoX on May 5, 2026, an open-source, publisher-agnostic geo experimentation tool that feeds Bayesian priors into Meridian MMM. Testing begins later in 2026 and Google hasn't said when it will be available to everyone.
GeoX is not the same as Google's Conversion Lift. Conversion Lift measures Google Ads inside Google. GeoX measures any channel against backend revenue you control. The methodology isn't even new: Google's trimmed_match and matched_markets repos have been on GitHub for years. What's new is the packaging, the name, and the explicit tie to Meridian.
Open source moves the cost from license to labor. Running GeoX in production still requires data pipelines, an analyst, and contamination monitoring. If your measurement team is one performance marketer, a managed platform is cheaper than self-hosting.
Across 225 geo tests on Stella's platform, the median iROAS was 2.31x with 88.4% reaching significance. Brands running tests today will plug into GeoX immediately when it ships. Brands waiting will still be learning the basics in 2027.
The rest of this post is the runbook.
What is Meridian GeoX?
GeoX is Google's open-source tool for running geographic incrementality experiments. It supports three test designs: holdback, go dark, and heavy up. Results convert directly into Bayesian priors that calibrate Meridian MMM. GeoX is publisher-agnostic, so it can measure Meta, TikTok, podcasts, or any channel by geography, not just Google Ads. Testing begins later in 2026.
The mechanic is the same as any geo holdout. Split geographies into a treatment group and a control group, change media exposure in one, and measure the gap. GeoX adds structure around three specific designs:
Holdback: keep ads active in most regions, pause them in a small set of holdout regions
Go dark: pause media entirely in selected regions and measure the drop
Heavy up: increase spend in selected regions and measure the gain
GeoX also runs multiple treatments against a shared control in the same study, which makes multi-cell tests cheaper. The output feeds into Meridian as priors, so the MMM learns from real causal evidence instead of inferring impact from correlation.
A useful frame: Conversion Lift is the platform handing you a lift number. GeoX is the methodology, open and inspectable, that lets anyone produce one.
Figure: iROAS distribution across 225 Stella geo tests. DTC advertising incrementality benchmarks, August 2024 to December 2025.
Is Meridian GeoX available yet?
Not yet. Testing begins later in 2026, and Google hasn't published a launch date. You can read the docs, look at the underlying trimmed_match and matched_markets repos, and run geo holdouts now using your current setup so you have baseline tests ready when GeoX ships.
The brands that get value from GeoX on day one are the brands already running disciplined geo tests today.
How to run a Google Ads incrementality test with Meridian GeoX
Nine steps. None are optional.
Step 1: Choose the campaign
Pick one campaign worth testing. The campaigns where platform ROAS and iROAS diverge most are usually:
Branded search. Most conversions would have happened anyway. iROAS often lands 60-80% below platform ROAS.
Performance Max. Mixes branded search, shopping, YouTube, and Display, so platform attribution is opaque. See the PMax-specific guide.
YouTube and Demand Gen. Long conversion windows, weak last-click attribution. Most likely to be undercredited.
Non-brand search. Cleanest test for whether the channel acquires new customers.
One test answers one question.
Step 2: Pick the test design
| Design | What changes | Best for | Risk |
| --- | --- | --- | --- |
| Holdback | Pause ads in a subset of regions | Lowest revenue risk | Smaller signal, needs more geos |
| Go dark | Pause the channel entirely in selected regions | Cleanest signal | Forfeits revenue during the test |
| Heavy up | Increase spend in selected regions | Testing whether more spend produces more lift | Requires extra budget |
Holdback is the default for most DTC brands. Go dark is right when you suspect a channel is barely incremental. Heavy up is right when you've validated a channel and want to test scaling.
Step 3: Select treatment and control markets
This is the single biggest predictor of significance. Across Stella's 225-test benchmark, the 11.6% of tests that didn't reach significance almost always failed on pre-period matching, not sample size.
Good matching means:
Similar revenue baseline in the 90 days before the test
Similar seasonality over the past 12 months
Similar media spend share across all channels
Similar daily conversion volume
Geographic separation (don't pair Manhattan and Brooklyn)
Most teams use DMAs as the geographic unit. Brands with retail footprints use trade areas. Pick what matches how your business runs.
Step 4: Prepare the data
Before launch, you need this dataset joined and clean:
Daily revenue by geo (from Shopify or your CRM, not Google Ads)
Daily spend by geo across every paid channel
Daily orders by geo
Promo calendar with start and end dates
Inventory outages on revenue-driving SKUs
Major site changes during the test window
Campaign IDs and geographic targeting settings
Treatment vs control assignment per geo
Contribution margin
Critical: outcome data has to come from your backend, not Google Ads. Platform conversions are filtered by Google's attribution model, which is the thing you're testing against. Using them as your outcome variable defeats the experiment.
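The core of that dataset is one row per day and geo, with backend revenue, total paid spend, and the treatment or control label side by side. Here is a minimal sketch of the join, assuming revenue arrives as one (day, geo, revenue) tuple per day and geo from your backend and spend arrives as (day, geo, spend) tuples with one row per channel. The function name and row shapes are hypothetical.

```python
def build_geo_panel(revenue_rows, spend_rows, assignment):
    """Join backend revenue and ad spend into one row per (day, geo).

    revenue_rows: iterable of (day, geo, revenue) from the backend
    spend_rows:   iterable of (day, geo, spend), one row per paid channel
    assignment:   dict mapping geo -> "treatment" or "control"
    """
    spend = {}
    for day, geo, amount in spend_rows:
        # Sum spend across channels that report the same day and geo.
        spend[(day, geo)] = spend.get((day, geo), 0.0) + amount

    panel = []
    for day, geo, revenue in revenue_rows:
        if geo not in assignment:
            # Fail loudly: an unassigned geo would silently bias the test.
            raise ValueError(f"geo {geo!r} has no treatment/control assignment")
        panel.append({
            "day": day,
            "geo": geo,
            "group": assignment[geo],
            "revenue": revenue,
            "spend": spend.get((day, geo), 0.0),  # no spend row means zero spend
        })
    # ISO date strings (YYYY-MM-DD) sort chronologically.
    return sorted(panel, key=lambda row: (row["day"], row["geo"]))
```

Promo dates, inventory outages, and site changes can be attached to the same rows as extra columns, so anomalies are visible next to the revenue they affected.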
Step 5: Configure Google Ads location settings
This is where most geo holdouts get contaminated.
Google Ads has two location targeting modes:
"Presence or interest" (the default): shows ads to people in, regularly in, OR showing interest in a location. That last category leaks ads into control regions whenever someone searches "best running shoes Seattle" from a control market.
"Presence": shows ads only to people physically located in the targeted regions.
Search partners and Display Network can override geographic restrictions. Disable both.
PMax campaigns reallocate budget across geographies dynamically. See the PMax guide.
Commuting zones can leak exposure across DMA boundaries. Avoid pairing adjacent markets.
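These checks are easy to automate against an export of your campaign settings. The sketch below assumes a plain dictionary per campaign; every key name is hypothetical and would need to be mapped to whatever your Google Ads export or API client actually returns.

```python
def contamination_risks(campaign: dict) -> list[str]:
    """Return a list of settings that could leak ads into control markets.

    All dictionary keys here are hypothetical names for an exported
    campaign config, not fields of a specific API.
    """
    risks = []
    if campaign.get("location_option") != "PRESENCE":
        risks.append('location targeting is not set to "Presence"')
    if campaign.get("search_partners_enabled", False):
        risks.append("search partners are enabled")
    if campaign.get("display_network_enabled", False):
        risks.append("Display Network is enabled")
    # A control geo that is also targeted defeats the holdout outright.
    overlap = set(campaign.get("targeted_geos", [])) & set(campaign.get("control_geos", []))
    if overlap:
        risks.append("control geos are targeted: " + ", ".join(sorted(overlap)))
    return risks
```

An empty list means none of these settings-level leaks were found. It says nothing about commuting-zone spillover or PMax budget reallocation, which settings alone cannot rule out.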
Step 6: Run the test
| Brand size or channel | Recommended duration |
| --- | --- |
| High-volume DTC ($10M+/year) | 2 to 4 weeks |
| Mid-market ($1M to $10M/year) | 4 to 6 weeks |
| YouTube, Demand Gen, upper funnel | Add 2 weeks post-treatment |
Run length should be driven by statistical power, not calendar habit.
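For a rough sense of what "driven by statistical power" means, here is a back-of-envelope sketch using the standard two-sided z-test sample size formula. It assumes the daily gap between treatment revenue and its control-based expectation is roughly independent from day to day with a stable standard deviation, which real revenue data often violates, so treat the output as a floor and not a plan. The function and parameter names are illustrative, and a dedicated power analysis in your testing tool should take precedence.

```python
from math import ceil
from statistics import NormalDist


def days_needed(residual_sd: float, expected_daily_lift: float,
                alpha: float = 0.05, power: float = 0.8) -> int:
    """Days required to detect an average daily lift of the given size.

    residual_sd:         std. dev. of the daily treatment-minus-expected gap,
                         estimated from the pre-period
    expected_daily_lift: the incremental revenue per day you hope to detect
    """
    if residual_sd <= 0 or expected_daily_lift <= 0:
        raise ValueError("residual_sd and expected_daily_lift must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) * residual_sd / expected_daily_lift) ** 2)
```

With a daily noise level of $1,000 and an expected lift of $500 per day, `days_needed(1000, 500)` returns 32. Halving the expected lift roughly quadruples the required duration, which is why small expected effects call for better-matched markets and not just more patience.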
Pre-launch QA checklist:
Treatment and control markets are statistically similar in the pre-period
Control regions are fully excluded in campaign settings
Location targeting is set to "Presence"
Other channels are stable (no concurrent test on Meta or TikTok)
Promo changes during the test window are documented
Backend revenue tracking confirmed working by geo
Step 7: Calculate lift and iROAS
The core formulas:
Incremental revenue = Treatment revenue − Expected revenue (what treatment would have earned without the ads)
Expected revenue is where synthetic controls come in. They use a weighted combination of control markets to model what treatment would have done without the campaign. See Stella's guide to weighted synthetic controls.
iROAS = Incremental revenue / Incremental ad spend
iROAS without contribution margin is a vanity number. A 2x iROAS is excellent at 60% margins and unprofitable at 30%. The decision is profit, not the ratio.
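The arithmetic can be sketched in a few lines. Note the simplification: expected revenue here comes from a single pre-period ratio between treatment and control totals, which is a much cruder counterfactual than a weighted synthetic control. All function names are illustrative.

```python
def expected_revenue(control_pre: list[float], treatment_pre: list[float],
                     control_test: list[float]) -> float:
    """Counterfactual treatment revenue: control test-period revenue
    scaled by the pre-period treatment/control ratio."""
    ratio = sum(treatment_pre) / sum(control_pre)
    return ratio * sum(control_test)


def iroas(treatment_revenue: float, expected: float,
          incremental_spend: float) -> float:
    """Incremental revenue per incremental ad dollar."""
    return (treatment_revenue - expected) / incremental_spend


def profit_per_ad_dollar(iroas_value: float, contribution_margin: float) -> float:
    """Contribution profit per incremental ad dollar, net of that dollar.
    Positive means the spend pays for itself."""
    return iroas_value * contribution_margin - 1.0


# Worked example with made-up numbers.
expected = expected_revenue(control_pre=[100_000.0], treatment_pre=[50_000.0],
                            control_test=[120_000.0])   # 0.5 * 120,000 = 60,000
result = iroas(treatment_revenue=80_000.0, expected=expected,
               incremental_spend=10_000.0)              # 20,000 / 10,000 = 2.0
```

At a 2x iROAS, `profit_per_ad_dollar` is about +0.20 at a 60% margin and about -0.40 at a 30% margin. Break-even iROAS is 1 divided by the contribution margin.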
Step 8: Feed the result into Meridian MMM
This is what GeoX changes most. The output of a GeoX test becomes a Bayesian prior in Meridian MMM, which means the model learns from your real experiment instead of guessing from correlation.
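To make "becomes a Bayesian prior" concrete: one common way to encode an experiment result is as a lognormal distribution over ROI, centered on the test's point estimate and as wide as its confidence interval. How GeoX will actually hand results to Meridian has not been published, so the sketch below is only an illustration. It assumes the point estimate is treated as the prior median and the interval is roughly symmetric on the log scale; the function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def lognormal_prior_from_test(point: float, ci_low: float, ci_high: float,
                              level: float = 0.95) -> tuple[float, float]:
    """Turn an iROAS estimate and its interval into lognormal (mu, sigma).

    Assumes point is the median and the interval is symmetric in log space.
    """
    if not 0 < ci_low < point < ci_high:
        raise ValueError("need 0 < ci_low < point < ci_high")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    mu = log(point)                            # log of the median
    sigma = (log(ci_high) - log(ci_low)) / (2 * z)
    return mu, sigma
```

A test with an iROAS of 2.0 and a 95% interval of 1.0 to 4.0 gives mu of about 0.69 and sigma of about 0.35. A tighter interval yields a smaller sigma, which means the experiment pulls the model harder toward its result.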
Figure: How calibration changes MMM accuracy. Stella MMM forward-test accuracy, with and without iROAS calibration.
Feeding real experiment results into the MMM as priors improves forward-test accuracy by 8 percentage points.
Source: Stella MMM benchmark. Forward tests measured on holdout periods the model has never seen, across Stella's customer base.
The eight-point gain matters because budget decisions live downstream of the model. An MMM at 87% misallocates a meaningful share of every quarter's spend. An MMM at 95% misallocates less.
Most teams don't have a single causal experiment to calibrate their MMM with. GeoX makes that gap embarrassing instead of invisible. For more, see Getting Started with MMM Using Google Meridian.
Step 9: Make the budget decision
The test should lead to an action.
| Result | Action |
| --- | --- |
| High platform ROAS + low iROAS | Cut or cap spend. The platform is taking credit for conversions that would have happened anyway. |
| Low platform ROAS + high iROAS | Scale. This is an undercredited growth channel and you've been under-investing. |
| High iROAS + high volume | Scale aggressively, then re-test at the higher spend level to check for diminishing returns. |
| Low iROAS + low volume | Reduce or kill. The channel isn't creating demand at scale. |
| Inconclusive | Rerun with better market design, longer duration, or a more sensitive KPI. |
The worst outcome is running the test, sharing the result, and changing nothing.
GeoX vs Conversion Lift vs manual geo holdout
| | Google Conversion Lift (Geo) | Manual or Platform Geo Holdout | Meridian GeoX |
| --- | --- | --- | --- |
| Who runs it | Google, via a rep | You or your vendor | You, with open-source code |
| Cost | Free, eligibility-gated | Variable | Free in license |
| Channels | Google Ads only | Any | Any (publisher-agnostic) |
| Conversion source | Google's tag, Firebase, DV360 | Backend revenue | Whatever you feed it |
| Methodology | Black box | Vendor-dependent | Open source, auditable |
| MMM integration | | | |
Conversion Lift is fine for "is this Google Ads campaign incremental inside Google." It also has the structural awkwardness of Google grading Google. You wouldn't accept that arrangement from any other publisher.
For everything else, you want the open framework.
What does GeoX actually cost?
Free in license, expensive in labor. To run GeoX in production, a brand needs:
Clean data pipelines joining Google Ads spend, Shopify revenue, and other channels into a single geographic dataset
An analyst who can configure the model, interpret Bayesian posteriors, and translate the result for a CFO
Contamination monitoring, especially around Google Ads location defaults
Maintenance, because the codebase will evolve on GitHub
If a brand has a measurement team, GeoX is great. If "measurement team" is one performance marketer with a Looker license, the math goes the other way. Stella's Incrementality product handles design, contamination checks, and synthetic control matching, and the MMM runs at $3,000 per month flat instead of $15K-$80K consulting fees.
GeoX makes self-hosting cheaper. It doesn't make self-hosting easy.
What Stella has learned from 225 geo incrementality tests
Region selection is the biggest predictor of significance. The 88.4% that reach significance almost always had clean pre-period matching.
Branded search and PMax show the widest platform-ROAS-to-iROAS gap. For both, iROAS routinely lands 60-80% below platform ROAS.
Geo contamination is the most common execution failure. Default Google Ads location targeting leaks ads into control markets.
Backend revenue beats platform-reported conversions every time. Platform conversions share the attribution bias the test is designed to measure against.
Tests become more useful when stored as priors. A single test calibrates one quarter. A library of tests calibrates an MMM permanently. This is the GeoX promise made concrete.
Common mistakes to avoid
Using platform-reported conversions as the outcome
Picking markets based on convenience instead of statistical similarity
Letting Google location targeting leak into control markets
Running the test too short
Ignoring contribution margin
Treating inconclusive as failed (inconclusive means redesign, not abandon)
Is Meridian GeoX free?
Yes, the code is open source under Google's Meridian project on GitHub. The license costs nothing. The engineering, analyst time, and data infrastructure required to run it in production are where the real cost lives.
Can GeoX measure channels other than Google Ads?
Yes. GeoX is publisher-agnostic. You can test Meta, TikTok, YouTube, podcasts, CTV, or offline media, as long as you can change media exposure by geography and pull a clean outcome signal.
How long should a Google Ads geo holdout run?
Two to four weeks for high-volume DTC brands. Four to six weeks for mid-market. Add a two-week post-treatment window for YouTube or any upper-funnel test. For setup detail, see the Google Ads incrementality guide.
Does GeoX replace MMM?
No. GeoX is the geographic experimentation layer. Meridian is the MMM. GeoX feeds causal evidence into the MMM as priors. Experiments validate causality, MMM allocates budget. You need both.
The bottom line
GeoX is the most important measurement announcement Google has made in years, and the most overrated.
It formalizes a methodology good measurement vendors have run for years, makes the open code easier to use, and gives CFOs a Google-branded reason to trust geographic experimentation.
It doesn't pick your markets, manage contamination, join your backend revenue, or interpret your results. Those are still the hard parts.
The brands that get value from GeoX when it ships are the brands running geo holdouts today.
Run your first Google Ads geo holdout before GeoX ships. Start a 7-day Stella trial and we'll match your markets, run the test, and have results ready before your next budget review.