Forget MDE. Why Experiment Planning Should Start With Time.
Welcome back!
Picture the start of your next experiment. You open the test plan, and the very first thing you're asked to decide is your Minimum Detectable Effect. So you put in a number. 1%, maybe 2%.
Here's the part most of us are too nervous to admit: we have no real idea what to put there. Seriously. We don't. We pick something that sounds rigorous, or copy what we used last time, and move on.
That's not a knock on anyone. It's structural. The MDE is the number baked into the statistics. Sample size, power and significance all flow from it, so it's the number every tool and calculator asks for first. We start with the maths, and only afterwards work our way toward the thing we actually have a feel for: runtime. How long will this take? Is that fast enough? Will leadership lose patience before we get a read?
I think we've got it backwards.
In commercial settings, runtime is the constraint people understand in their bones. So why are we starting anywhere else?
That's the shift I want to walk through today: stop starting with the maths, and start with the time. It's a reframe I posted on LinkedIn recently, and it clearly hit a nerve: a lot of debate, some sharp pushback, and a fair bit of "finally, someone said it." I'll bring the best of that thread in below.
The trap hiding inside "we need a 1% MDE"
Let's be honest about where that 1% usually comes from. For most teams, it's pulled out of thin air as a vague sense of what "good" looks like, a half-remembered benchmark, or a couple of assumptions stacked together. Rarely is there much clarity on where the number came from or what the team will actually do with it.
In practice, "we need a 1% MDE" is often just a slow way of saying "we're not sure what would actually move the needle." A tiny MDE feels rigorous. But frequently it's a way of dodging the harder question: what could we build that's genuinely worth testing?
And then the maths takes over. Here's how it plays out, almost every time:
The team fixes a tiny MDE (~1%).
They run the sample-size and runtime calculation.
The test now needs weeks, sometimes months.
And then things quietly go wrong.
That last step deserves more honesty than it usually gets, because it splits into two failure modes:
The team fudges it. Faced with a runtime no one's happy with, they quietly work backwards (nudging the MDE, the baseline, the traffic assumptions) until the maths spits out the runtime they wanted all along. The rigour was theatre.
The team holds the line and pays for it. They accept the long runtime out of principle. But slow learning cycles erode momentum, and stakeholders start to wonder whether experimentation is worth the wait. Unless that long runtime is genuinely warranted (more on when it is below), the cost isn't just time. It's trust in the whole approach.
Either way, the MDE you started with did you no favours.
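To put rough numbers on that chain, here is a minimal sketch of the sample-size and runtime calculation. It assumes a two-arm test on a conversion rate, a two-sided test at 5% significance with 80% power, and the standard normal-approximation formula. The function name, the 5% baseline and the 100,000 daily visitors are hypothetical, picked only for illustration; your own calculator may use a slightly different formula.

```python
from math import ceil
from statistics import NormalDist


def days_needed(baseline, relative_mde, daily_visitors, alpha=0.05, power=0.8):
    """Approximate runtime (days) for a two-arm test on a conversion rate.

    Assumes traffic is split evenly across both arms and uses the
    normal-approximation sample-size formula for two proportions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # rate we want to be able to detect
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n_per_arm = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return ceil(2 * n_per_arm / daily_visitors)


# Hypothetical site: 5% baseline conversion, 100,000 visitors a day.
for mde in (0.01, 0.02, 0.05):
    print(f"MDE {mde:.0%}: about {days_needed(0.05, mde, 100_000)} days")
```

Under these assumptions a 1% relative MDE needs roughly two months of traffic, while a 5% MDE needs only a few days. Runtime scales with roughly one over the MDE squared, which is why a small change to the number you type in first has such an outsized effect on the calendar.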
The flip: fix time first
The reframe is simple. Instead of starting with the effect you want to detect, start with the time you can afford to spend.
Pick the longest runtime you can genuinely live with. Maybe it's two weeks. Maybe your business can stomach six. That number is your Maximum Acceptable Runtime, or MAR for short. (As far as I know, I made this term up, so if MARs start taking over your planning meetings... you know who to blame, and you're welcome.)
It's a far more honest constraint, because it reflects how your business actually operates: its appetite for risk, its leadership pressure, its planning rhythm. These are things teams have real intuition about, unlike a 1% MDE.
Then you calculate the MDE from that.
The effect size you get back will usually be bigger than you'd like. Often 5 to 10%+.
That discomfort is the point.
The precise number is largely irrelevant. A big MDE forces a better question. Not "what small optimisation can we squeeze out?" but "what can we build that will teach us something real and create a signal big enough to detect in the time we have?"
So the next time experiments are dragging on:
Fix time first. Then design for signal.
The best experiments aren't the ones that detect the smallest effects. They're the ones that teach you the most, the fastest.
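For anyone who wants to try the "fix time first" step by hand, here is a minimal sketch of working out the MDE from a MAR. It assumes a two-arm test on a conversion rate, a two-sided test at 5% significance with 80% power, and a common simplification that treats both arms as having the baseline variance. The function name and the example figures (5% baseline, 10,000 daily visitors, a 14-day MAR) are hypothetical, not a recommendation.

```python
from math import sqrt
from statistics import NormalDist


def mde_from_mar(baseline, daily_visitors, mar_days, alpha=0.05, power=0.8):
    """Approximate relative MDE detectable within a fixed runtime.

    Assumes an even split across two arms and approximates the variance
    of both arms with the baseline rate (a simplification).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n_per_arm = daily_visitors * mar_days / 2  # traffic each arm collects
    absolute_mde = z * sqrt(2 * baseline * (1 - baseline) / n_per_arm)
    return absolute_mde / baseline  # express as a relative lift


# Hypothetical site: 5% baseline conversion, 10,000 visitors a day.
for days in (14, 28, 42):
    print(f"MAR {days} days: MDE of about {mde_from_mar(0.05, 10_000, days):.1%}")
```

With these made-up inputs, a two-week MAR gives an MDE of around 6.5%, squarely in the uncomfortable range described above. Note that doubling the runtime only shrinks the MDE by a factor of about 1.4, so extra weeks buy less sensitivity than most people expect.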
Why this works in the real world
This isnât just a planning trick. It changes the conversations around experimentation.
It aligns with how businesses actually run. When I shared this thinking, Jake Lambert, Head of Optimisation at Fresh Egg, put it better than I could:
"We adopted this way of thinking at Fresh Egg and it's so much more in sync with how businesses work. Aligning to monthly sprints and centering experiment design around that is a much easier way of working compared to experiments running for 'too long' and losing internal momentum and engagement."
He's nailing the quiet cost of MDE-first planning: experiments that overstay their welcome bleed internal momentum, no matter how rigorous they are. A MAR-first lens keeps experimentation in step with the cadence the rest of the business already moves to.
It forces decision-thinking up front. A bigger MDE pushes teams to ask what they'd actually do with a result, which is the whole game. As Bertil Hatt framed it in the comments, it's worth picturing a frustrating or borderline result at the moment of design and asking, "what would we do at that point?" Fixing time first surfaces those decisions before you've sunk weeks into a test.
Two honest caveats
First, let me clear up the obvious objection: every number here, like two weeks, is just my example. A MAR is not a universal cap, and it's certainly not "every test stops at 14 days." Jakub Linowski made this point well in the comments: any fixed cap will underpower some experiments and raise false negatives. He's right, and it's not actually a disagreement. It's the whole spirit of the idea: the right runtime is the longest one available to you, and that's deeply context-dependent. Some businesses have the luxury to invest in long-running tests. Others face such intense leadership pressure that a fortnight already feels generous. Your job is to find your number honestly, not to copy mine.
Second, mind your customer action cycle. Goddy Tams George made the sharpest version of this: if your customers take 45 days to decide, a two-week read isn't detecting your effect, it's catching an early-funnel reaction dressed up as an outcome. But here's how I'd handle it: this isn't something to re-litigate for every experiment. For most teams, the action cycle is a fairly stable property of their context. Work it out once, and bake it into the MAR you set for that kind of test. If your customers genuinely need 45 days to respond, then 45 days (or more) is your floor, and your MAR has to respect it.
The common thread: a MAR makes runtime a deliberate, business-aware decision instead of an accident of whatever tiny MDE you happened to type in first. Choose it honestly, account for your customers, and let genuinely high-stakes tests earn a longer one.
The takeaway
Donât start from the smallest effect you can imagine. Start from the longest runtime you can accept.
Let the MDE fall out of the MAR. A bigger, uncomfortable number is a feature: it forces bolder, more decision-worthy tests.
Bend the rule on purpose, not by accident. Stretch your MAR for long customer-action cycles and genuinely high-stakes tests; don't let a reflexive 1% MDE stretch it for you.
If every test had to produce a read in two weeks (or, honestly, probably less these days), how would you build differently?
Hit reply and tell me your team's Maximum Acceptable Runtime. I'd love to know how much it changes what you'd choose to test.
Until next time
Some of my other resources you might find useful:
The Experimenter's Calculator: plan high-quality experiments (and try the MAR-first approach for yourself)
Intro to A/B Test Statistics: a free webinar for practitioners







