Forget MDE. Why Experiment Planning Should Start With Time.

Jun 26, 2026

👋 Welcome back!

Picture the start of your next experiment. You open the test plan, and the very first thing you’re asked to decide is your Minimum Detectable Effect. So you put in a number. 1%, maybe 2%.

Here’s the part most of us are too nervous to admit: we have no real idea what to put there. Seriously. We don’t. We pick something that sounds rigorous, or copy what we used last time, and move on.

That’s not a knock on anyone. It’s structural. The MDE is the number baked into the statistics. Sample size, power and significance all flow from it, so it’s the number every tool and calculator asks for first. We start with the maths, and only afterwards work our way toward the thing we actually have a feel for: runtime. How long will this take? Is that fast enough? Will leadership lose patience before we get a read?

I think we’ve got it backwards.

In commercial settings, runtime is the constraint people understand in their bones. So why are we starting anywhere else?

That’s the shift I want to walk through today: stop starting with the maths, and start with the time. It’s a reframe I posted on LinkedIn recently, and it clearly hit a nerve: a lot of debate, some sharp pushback, and a fair bit of “finally, someone said it.” I’ll bring the best of that thread in below.

The trap hiding inside “we need a 1% MDE”

Let’s be honest about where that 1% usually comes from. For most teams, it’s pulled out of thin air as a vague sense of what “good” looks like, a half-remembered benchmark, or a couple of assumptions stacked together. Rarely is there much clarity on where the number came from or what the team will actually do with it.

In practice, “we need a 1% MDE” is often just a slow way of saying “we’re not sure what would actually move the needle.” A tiny MDE feels rigorous. But frequently it’s a way of dodging the harder question: what could we build that’s genuinely worth testing?

And then the maths takes over. Here’s how it plays out, almost every time:

The team fixes a tiny MDE (~1%).
They run the sample-size and runtime calculation.
The test now needs weeks, sometimes months.
And then things quietly go wrong.

That last step deserves more honesty than it usually gets, because it splits into two failure modes:

The team fudges it. Faced with a runtime no one’s happy with, they quietly work backwards — nudging the MDE, the baseline, the traffic assumptions — until the maths spits out the runtime they wanted all along. The rigour was theatre.
The team holds the line and pays for it. They accept the long runtime out of principle. But slow learning cycles erode momentum, and stakeholders start to wonder whether experimentation is worth the wait. Unless that long runtime is genuinely warranted (more on when it is below), the cost isn’t just time. It’s trust in the whole approach.

Either way, the MDE you started with did you no favours.

The flip: fix time first

The reframe is simple. Instead of starting with the effect you want to detect, start with the time you can afford to spend.

Pick the longest runtime you can genuinely live with. Maybe it’s two weeks. Maybe your business can stomach six. That number is your Maximum Acceptable Runtime: MAR for short (As far as I know, I made this term up, so if MARs starts taking over your planning meetings… you know who to blame and you’re welcome).

It’s a far more honest constraint, because it reflects how your business actually operates: its appetite for risk, its leadership pressure, its planning rhythm. These are things teams have real intuition about, unlike a 1% MDE.

Then you calculate the MDE from that.

The effect size you get back will usually be bigger than you’d like. Often 5 to 10%+.

That discomfort is the point.

The precise number is largely irrelevant. A big MDE forces a better question. Not “what small optimisation can we squeeze out?” but “what can we build that will teach us something real and create a signal big enough to detect in the time we have?”

So the next time experiments are dragging on:

Fix time first. Then design for signal.

The best experiments aren’t the ones that detect the smallest effects. They’re the ones that teach you the most, the fastest.

Why this works in the real world

This isn’t just a planning trick. It changes the conversations around experimentation.

It aligns with how businesses actually run. When I shared this thinking, Jake Lambert, Head of Optimisation at Fresh Egg, put it better than I could:

“We adopted this way of thinking at Fresh Egg and it’s so much more in sync with how businesses work. Aligning to monthly sprints and centering experiment design around that is a much easier way of working compared to experiments running for ‘too long’ and losing internal momentum and engagement.”

He’s nailing the quiet cost of MDE-first planning: experiments that overstay their welcome bleed internal momentum (no matter how rigorous). A MAR-first lense keeps experimentation in step with the cadence the rest of the business already moves to.

It forces decision-thinking up front. A bigger MDE pushes teams to ask what they’d actually do with a result, which is the whole game. As Bertil Hatt framed it in the comments, it’s worth picturing a frustrating or borderline result at the moment of design and asking, “what would we do at that point?” Fixing time first surfaces those decisions before you’ve sunk weeks into a test.

Two honest caveats

First, let me clear up the obvious objection: all the numbers here, like two weeks, is just my example. A MAR is not a universal cap, and it’s certainly not “every test stops at 14 days.” Jakub Linowski made this point well in the comments: any fixed cap will underpower some experiments and raise false negatives. He’s right, and it’s not actually a disagreement. It’s the whole spirit of the idea: the right runtime is the longest one available to you, and that’s deeply context-dependent. Some businesses have the luxury to invest in long-running tests. Others face such intense leadership pressure that a fortnight already feels generous. Your job is to find your number honestly, not to copy mine.

Second, mind your customer action cycle. Goddy Tams George made the sharpest version of this: if your customers take 45 days to decide, a two-week read isn’t detecting your effect, it’s catching an early-funnel reaction dressed up as an outcome. But here’s how I’d handle it: this isn’t something to re-litigate every experiment. For most teams, the action cycle is a fairly stable property of their context. Work it out once, and bake it into the MAR you set for that kind of test. If your customers genuinely need 45 days to respond, then 45 days (or more) is your floor, and your MAR has to respect it.

The common thread: a MAR makes runtime a deliberate, business-aware decision instead of an accident of whatever tiny MDE you happened to type in first. Choose it honestly, account for your customers, and let genuinely high-stakes tests earn a longer one.

The takeaway

Don’t start from the smallest effect you can imagine. Start from the longest runtime you can accept.
Let the MDE fall out of the MAR. A bigger, uncomfortable number is a feature and forces bolder, more decision-worthy tests.
Bend the rule on purpose, not by accident. Stretch your MAR for long customer-action cycles and genuinely high-stakes tests; don’t let a reflexive 1% MDE stretch it for you.

If every test had to produce a read in two weeks — or, honestly, probably less these days — how would you build differently?

Hit reply and tell me your team’s Maximum Acceptable Runtime. I’d love to know how much it changes what you’d choose to test.

Until next time 🙌

Simon
linkedin.com/in/drsimonj

Some of my other resources you might find useful:
🧮 The Experimenter’s Calculator: plan high-quality experiments (and try the MAR-first approach for yourself)
📊 Intro to A/B Test Statistics: a free webinar for practitioners

The Experimenter’s Advantage

Discussion about this post

Ready for more?