How AI Is Reshaping the Future of Experimentation Programs
It's not about speed. It's about scope.
👋 Welcome back!
There’s an anxiety creeping through experimentation teams right now. Nobody’s saying it out loud, but I hear it between the lines in almost every conversation I have with experimentation leaders:
If engineering can ship this fast, does anyone still need us?
AI-assisted development is accelerating how quickly teams can build and deploy product changes. What used to take a sprint takes a day. What used to take a quarter is becoming a month. And when shipping gets that fast, the instinct is to wonder whether experimentation — with its sample sizes, its run times, its careful measurement — becomes the bottleneck everyone routes around.
I think that instinct is wrong. But not for the reason you might expect.
So, as someone watching this AI-driven shift play out in real time, I wanted to share my thoughts on how our industry needs to evolve over the next few years.
Learning doesn’t lose value. It changes shape.
The core promise of experimentation has never been about slowing things down. It’s about learning: understanding what works, what doesn’t, and why. That value doesn’t disappear when shipping gets faster. If anything, it increases.
But here’s what does change: what you’re learning from, what you’re learning about, and why you’re learning it.
When a team can spin up 50 new ideas in a week instead of five, the volume of decisions that need evidence behind them multiplies. The question isn’t whether to learn. It’s whether your program can keep up with the pace of decisions that now need to be made.
And for most teams, it can’t. But not because experimentation is slow.
Because ownership is too narrow.
The real bottleneck isn’t speed. It’s scope.
Here’s the pattern I see: a team owns a specific feature or surface. They ship a change, set up an experiment, and then… wait. They need data. They need time for signal to accumulate. You can’t manufacture true evidence: that has a velocity cap no amount of AI can remove.
So the team is stuck. They either:
Wait: killing the speed advantage they just gained.
Skip the experiment: shipping fast but learning nothing.
Peek early: making decisions on noise.
None of these are good. And the root cause isn’t that experimentation takes time. It’s that the team has nowhere else to go while they wait.
It reminds me of the old elevator problem. When tenants complained their building’s elevators were too slow, the solution wasn’t faster motors. It was installing mirrors next to the doors. People stopped noticing the wait because they had something to look at. The real issue was never the speed of the elevator. It was the emptiness of the wait.
Experimentation has the same problem. You can’t make learning faster. But you can make sure teams aren’t standing around staring at the doors.
This is the narrow-ownership trap. When a team’s world is one page, one feature, one surface, there’s no room to rotate. Speed without scope creates a queue, not a learning engine.
The rotation model
Now picture a different setup.
A team owns customer acquisition as an outcome, not a specific button or page. Their territory spans value perception, onboarding flows, pricing clarity, and payment experience.
With AI-assisted engineering, they spin up 50–100 ideas focused on the onboarding funnel. Those experiments go into the program: designed, prioritised, launched. While they’re running and collecting signal, the team shifts focus to the payment flow. They build there, spin up the next wave of experiments, and send those off too.
By the time the onboarding results come back, the team has been productive the entire time. They haven’t waited. They haven’t skipped measurement. They’ve rotated.
This is what I think high-performing experimentation looks like in an AI-accelerated world: teams managing many concurrent threads across a wide enough ownership space that they can build in one area while learning in another.
Not slower. Not faster without guardrails. Just structured differently.
Why this runs against the grain
For the past decade, the trend in product organisations, especially at large companies, has been to narrow ownership. Smaller teams. More specific surfaces. Tighter scope.
In theory, this creates clarity and accountability. In practice, I’ve seen it create something else entirely.
I worked in a tech company where teams ended up with ownership scoped to very specific features. Worse, they were KPI’d on adoption and usage of those features. The result? Teams pushing their own things, boosting their own metrics — often at the cost of cannibalising others — without moving the needle on the outcomes that actually mattered, like acquisition or retention.
That model was designed for a world where shipping was expensive and slow. When building a feature was a quarter-long investment, it made sense to give a team singular focus on it.
In a world where building is fast and cheap, that logic breaks down.
Ownership will expand
In my view, the micro-team, single-feature ownership model is ending.
Not because it was wrong. It was right for its era. But the constraint it was designed around (shipping is expensive) is dissolving. And the constraint that remains (learning is slow) demands a different structure.
Teams of the future will own broader territories. They’ll be aligned to higher-level outcomes like acquisition, retention, and activation rather than individual surfaces. The number of surfaces and options they experiment across will increase, because they’ll have the building capacity to explore more of the space.
This maps naturally to what’s often called the “empowered product team” model, where teams are accountable for outcomes, not output. That concept has existed for a while. What changes now is the scale at which it needs to operate.
An empowered product team with AI-assisted engineering and a wide ownership scope can generate ideas, build, and experiment at a rate that would have been unimaginable two years ago. But only if the experimentation program underneath them can support it.
What needs to change in experimentation programs
This is where things get honest.
Most experimentation programs and the tools that support them were built for a world of sequential experiments with narrow scope. One team, at most a few tests and a few decisions at a time.
The rotation model demands something different. Teams switching in and out of problem spaces need to be able to pick up where they left off. They need memory: a clear record of what they’ve tested, what they’ve learned, and where the compounding insights are forming.
Without that, the rotation model collapses into something much worse: lots of threads, no continuity. Teams forget what they’ve done. Learnings evaporate between rotations. The idea backlog becomes a graveyard of disconnected, low-impact tests instead of a structured engine for compounding knowledge.
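As a thought experiment, here's a minimal sketch of what that kind of program memory could look like as a data structure. Everything in it is illustrative: the field names, statuses, and grouping are my own assumptions, not a description of any existing tool.

```python
# Illustrative sketch only: experiment records grouped by problem space,
# so a team rotating back into an area can see what was tested and learned.
from dataclasses import dataclass, field

@dataclass
class ExperimentRecord:
    hypothesis: str
    status: str           # e.g. "running", "concluded", "abandoned"
    decision: str = ""    # what was shipped, rolled back, or parked
    learning: str = ""    # the insight worth carrying into the next wave

@dataclass
class ProblemSpace:
    outcome: str          # e.g. "acquisition" or "retention"
    surface: str          # e.g. "onboarding funnel" or "payment flow"
    records: list[ExperimentRecord] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Hypotheses still in flight when the team rotates back in."""
        return [r.hypothesis for r in self.records if r.status == "running"]
```

Whatever form it takes, the point is that the record of tests, decisions, and learnings has to outlive any single rotation.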
And then there’s the operational side. The things that are painful today, like figuring out statistical power, selecting the right test design, sequencing experiments in parallel, and making clear decisions from ambiguous results, all become more painful as volume increases. If these aren’t effectively on autopilot, the program becomes a bottleneck regardless of how the teams are structured.
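To make "autopilot" concrete for just one of these mechanics, here's a minimal sketch of a routine sample-size calculation for a two-proportion test. The function name, defaults, and example numbers are illustrative assumptions; a real program would automate far more than this single step.

```python
# Illustrative sketch: per-variant sample size for a two-proportion test,
# using the standard normal-approximation formula.
import math
from scipy.stats import norm

def required_sample_size(baseline_rate: float,
                         relative_lift: float,
                         alpha: float = 0.05,
                         power: float = 0.80) -> int:
    """Users needed per variant to detect a relative lift on a conversion rate."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
    z_beta = norm.ppf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = ((z_alpha + z_beta) ** 2 * variance) / (p1 - p2) ** 2
    return math.ceil(n)

# e.g. a 4% baseline conversion rate and a 10% relative lift:
# required_sample_size(0.04, 0.10) -> roughly 39,500 users per variant
```

Multiply that kind of decision by dozens of concurrent experiments and it's clear why it can't remain a manual, specialist task.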
In short, the future needs two things from experimentation programs:
Program-level tracking: the ability to manage structured programs of work across multiple problem spaces, so teams can switch context without losing the thread.
Operational autopilot: the statistical and design mechanics that currently require specialist effort need to be automated, so that increased experiment volume doesn’t require proportionally increased human overhead.
Neither of these exists in mature form today. And until they do, even the best-structured teams will struggle to realise the full potential of AI-accelerated shipping.
The bottom line
AI isn’t going to kill experimentation. But it is going to expose the structural limitations of how most teams are set up to learn.
The teams that thrive will be the ones who recognise that the bottleneck was never speed. It was scope. They’ll widen their ownership, rotate across problem spaces, and invest in the program-level infrastructure that makes compounding learning possible at a pace that matches their building capacity.
Speed is being solved. Learning isn’t. Not yet. In my view, the organisations that figure out how to learn as fast as they can ship will be the ones that pull ahead. This is the problem I’m focussed on solving.
This is a prediction, not a report from the field. I haven’t seen this fully play out yet. But the logic feels inevitable to me: code velocity is accelerating past a point where narrow ownership and manual experimentation workflows can keep up. Something has to give, and I think it’s the ownership model.
I wrote this piece because I genuinely believe this is where things are headed. But I’m one person with one lens.
I want to hear your perspective. Do you see the same shift coming? Are you already feeling the tension between shipping speed and learning speed? Or do you think I’ve got this wrong?
Reply to this email or, better yet, leave a comment on the post so we can get a real discussion going.
Until next time 🙌
Simon linkedin.com/in/drsimonj
Some of my resources you might find useful:
🧮 The Experimenter’s Calculator: a tool to plan high-quality experiments
📊 Intro to A/B Test Statistics: a free webinar for practitioners





Hi Simon! Thanks for sharing your thoughts on this, it's highly appreciated.
I agree with the general view, but I think most of what you are "seeing" now was also true before AI. The cost of "building" things has been trending downwards for many years now. So the "bottleneck" was always on the "soft" side of things: management, research, prioritization, business context, etc.
With AI, that "cost" has dropped even further, and very rapidly, which makes the constraints in all the other areas more obvious. And with some of those, AI could also help, like summarizing big volumes of research data, or speeding up reporting and dashboards and the like for presenting results.
The part I don't agree with is this:
Operational autopilot: the statistical and design mechanics that currently require specialist effort need to be automated, so that increased experiment volume doesn’t require proportionally increased human overhead.
Proper research, prioritization given business and customer context, and statistical design are all highly editorial tasks, where AI usually falls short. Can it help make them faster? Sure, but fully automated? Right now, at least, I don't think so, and even if it could be, I can't see why someone would want that.
What I can see happening in the short term is that, with the cost of building becoming MUCH lower thanks to AI, exploratory "tests" become much more lucrative. Approaches like the "2 Stage" one that we researched, or any explore-exploit method, probably have better payoffs now, since for each new hypothesis you can maybe come up with 10 or 20 "treatments" and pick out real potential candidates for a subsequent confirmatory test much sooner.
So where some hypotheses previously required 2, 3, or 4 "confirmatory" iterations with "bigger" samples until you got a consistent or good-enough winner, you can now do a "2 stage test" in the same or even less time.
This is an interesting perspective—seems very much like a potentially positive way for things to play out.
There does seem to be a *little* bit of an echo of the early days of GUI-driven testing platforms (read: Optimizely), when there was a frenetic pace of "just do stuff and test everything". It took years to start to get past that mentality so that there would be some measure of deliberation before testing something. Question #1: do you see any risk of this? Basically, "We're going to be waiting for test results, so we need to just kick up the volume of 'things' we're testing (across former silos) so that, once we get a bunch rolling, we'll have a steady stream of results flowing in." In other words, it's so fast to build that thoughtful progress goes out the window.
Question #2: do you see this envisioned future having any impact on the various challenges of overlapping tests? I feel like a case could be made that this could get worse, but maybe it's actually no different from the current situation (and, with properly designed tooling and processes, it's not an issue?).