
Minimum Detectable Effect: A Guide for Smart A/B Tests

Master the Minimum Detectable Effect (MDE) in A/B testing. This guide explains how to choose and calculate your MDE to run better, faster experiments.


So, what exactly is the Minimum Detectable Effect (MDE)?

Put simply, it's the smallest change in a key metric—like your conversion rate—that an A/B test is designed to reliably notice. It’s the minimum bar a new variation has to clear for you to confidently say it's a winner.

Think of it as setting the stakes before you even start the experiment. You're defining what a "meaningful" result looks like for your business.

What Is Minimum Detectable Effect and Why It Matters


Here's an analogy I like to use: imagine you're in a crowded, noisy room trying to hear someone. If they whisper, you’d have to listen intently for a long time to be sure you heard them correctly. But if they shout? You'll know what they said almost instantly.

The Minimum Detectable Effect in experimentation works the same way. Setting a low MDE is like listening for a whisper. You're trying to spot a very subtle improvement, which means you'll need a lot more data (a larger sample size) and a longer experiment to be certain.

On the other hand, setting a high MDE is like listening for a shout. You only care about big, obvious wins, so you can spot them much faster and with a lot less traffic.

The Strategic Importance of MDE

Deciding on your MDE isn't just a statistical box to check; it’s a crucial strategic decision that forces you to balance your ambitions with practical reality.

Before you launch any test, you have to ask the most important question: "What is the smallest lift that would actually be valuable enough for us to implement?" This question connects your experiment directly to business impact and resource planning. A tiny 0.5% lift might sound nice, but is it really worth the engineering hours needed to roll it out permanently? Maybe not.

Your MDE is a core part of your experimental design. For any A/B test with standard settings—like a statistical power of 80%—the MDE determines the smallest uplift your test can actually find and declare statistically significant. To learn more about this crucial statistical relationship, check out this excellent glossary entry on the Minimum Detectable Effect by Analytics Toolkit.

By defining your MDE upfront, you turn a vague goal like "improve conversions" into a specific, measurable, and realistic target. It's the guardrail that prevents you from running pointless tests or chasing statistical ghosts.

Ultimately, your choice of MDE has a massive impact on the feasibility of your entire testing program. A well-chosen MDE ensures you're hunting for changes that truly matter while making the most of your traffic and time.

How MDE Impacts Your Experiment's Sample Size

As we've touched on, there's a direct and unavoidable trade-off between the MDE you set and the amount of traffic (sample size) your experiment needs. The smaller the effect you want to detect, the more people you need to show your test to.

This table shows just how quickly sample size requirements can grow as you aim for smaller MDEs.

| Desired MDE (Uplift) | Required Sample Size (Approx.) | Experiment Feasibility |
| --- | --- | --- |
| 10% | Low (e.g., 5,000 users) | High (Quick and easy to run) |
| 5% | Medium (e.g., 20,000 users) | Moderate (Achievable for many sites) |
| 2% | High (e.g., 125,000 users) | Low (Requires significant traffic) |
| 1% | Very High (e.g., 500,000+ users) | Very Low (Only feasible for huge sites) |

The numbers make it clear: being ambitious with a low MDE isn't always practical. If you don't have the traffic to support it, your experiment will either run forever or end without giving you a reliable answer. This is why aligning your MDE with your available traffic is one of the first steps in responsible experiment design.

The Four Levers of Experiment Design

Think of designing a successful A/B test like tuning an instrument. You have four main strings to adjust, and tweaking one changes the sound of all the others. Getting them in harmony is the key to running experiments that actually tell you something useful.

It's a delicate balancing act. You can’t just pull on one lever without considering the ripple effect it has on the rest of your setup. These four components work together to define how sensitive, reliable, and practical your test will be.

This infographic does a great job of showing how the Minimum Detectable Effect is tied directly to the other core parts of planning an experiment.


As you can see, MDE isn’t a number you pick out of thin air. It’s the central point connected to the three other pillars of your experimental design.

Understanding Each Lever

Let’s break down what each of these four levers actually does. Getting a feel for how they interact is what separates a basic testing process from a truly strategic one. If you want to see these concepts in action, our guide on mastering A/B split testing provides practical examples to help you get your hands dirty.

  1. Significance Level (α): This is your threshold for being wrong about a "win." It’s your risk of a false positive. A lower significance level (like 1% instead of the common 5%) means you need much stronger proof before declaring a winner. It's your safety net against rolling out a change that doesn't actually help.

  2. Statistical Power (1-β): This is the flip side—it’s the probability that your test will actually spot a real effect if one exists. High power (usually 80% or 90%) reduces your risk of a false negative, which is when you miss out on a genuinely good idea. Just like with significance, getting more power requires more data.

  3. Sample Size (n): This is simply the number of users or sessions you include in your experiment. Think of it as the fuel for your statistical engine. A bigger sample size gives your test more muscle, making it capable of detecting smaller effects with confidence.

  4. Minimum Detectable Effect (MDE): And of course, this is the smallest improvement you’ve decided is worth detecting in the first place.

These four elements are locked together in a mathematical relationship. If you define any three, the fourth is automatically set. For example, once you pick your desired MDE, significance, and power, the sample size you'll need is calculated for you.
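
If you'd like to see that relationship in code rather than in a calculator UI, here's a minimal sketch in Python (assuming the statsmodels library; the baseline rate and the 10,000-user figure are illustrative). Leaving exactly one of the inputs unspecified lets the library solve for it, which is the "define any three, get the fourth" idea in practice.

```python
# pip install statsmodels
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.040           # illustrative baseline conversion rate (4%)
relative_mde = 0.10        # the smallest lift we care about: +10% relative
variant = baseline * (1 + relative_mde)

# Convert the two conversion rates into a standardized effect size (Cohen's h).
effect_size = proportion_effectsize(variant, baseline)

analysis = NormalIndPower()

# MDE, significance, and power fixed -> solve for the sample size per variation.
n_per_variation = analysis.solve_power(effect_size=effect_size, alpha=0.05,
                                       power=0.80, ratio=1.0,
                                       alternative='two-sided')

# Sample size, significance, and MDE fixed -> solve for the power you'd actually
# get if you could only send 10,000 users to each variation.
achieved_power = analysis.solve_power(effect_size=effect_size, nobs1=10_000,
                                      alpha=0.05, ratio=1.0,
                                      alternative='two-sided')

print(f"Users needed per variation: {n_per_variation:,.0f}")
print(f"Power with only 10,000 users per variation: {achieved_power:.0%}")
```

Swap in your own baseline rate and traffic numbers to feel how quickly the levers push against each other.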

This interplay is exactly why planning is so crucial. If you get ambitious and set a tiny MDE without thinking about your website traffic (your sample size), you could end up with a test that’s doomed to fail because it's underpowered or would have to run for six months to get a result.

The real art and science of good experimentation is learning how to balance these four levers to create a test that is both powerful and practical.

How to Choose the Right MDE for Your Business


Alright, let's move from theory to the real world. Choosing your minimum detectable effect isn't just about crunching numbers in a calculator; it's a strategic call. A well-chosen MDE is the sweet spot that makes your experiments both meaningful and actually possible to run.

There are three solid ways to approach setting your MDE. The trick is to blend insights from all three to land on a number that actually works for your goals, your history, and your resources. This is how you turn MDE from an abstract idea into a practical tool.

Align MDE with Business Impact

First things first: what's the smallest win that's actually worth the trouble of implementing? Think about it. Every change you push live costs something—developer time, design work, future maintenance, and the focus you're pulling away from other projects.

Your MDE needs to reflect this "opportunity cost." For instance, let's say a test shows a tiny 0.5% bump in conversions. Is that really enough to justify pulling an engineer off a major feature for a week to ship it? If the answer is a hard no, then your MDE should be higher than 0.5%.

This forces a critical conversation between your marketing, product, and engineering teams. It makes sure everyone is aligned on chasing outcomes that deliver a real return, not just statistically significant results that are practically worthless.

Analyze Your Historical Performance

Your past experiments are a goldmine. Seriously. Go back and look at your previous wins. What was the average effect size you typically saw? This historical data gives you a powerful, realistic baseline for what's achievable with your audience on your product.

If your last five successful A/B tests delivered uplifts somewhere between 4% and 8%, then setting your next MDE at 1% is probably wishful thinking. It would also demand a massive sample size you likely don't need. On the flip side, if your wins are usually small and incremental, aiming for a 20% lift is setting yourself up for failure.

Grounding your MDE in your own history helps you:

  • Set achievable goals: You have hard evidence of what's possible.

  • Manage expectations: Stakeholders get a clear picture of what a typical "win" looks like for your business.

  • Plan better: You can more accurately estimate how long a test will need to run and what resources it will require.

Consider Your Resource Constraints

Finally, you have to get real about your traffic and timeline. This is where the rubber meets the road. Your MDE is directly tied to the sample size you need to reach statistical significance. The lower your MDE, the more users you'll need.

It's time for a reality check. Grab a sample size calculator and play with the numbers. See how your daily traffic, your desired MDE, and the test duration all interact. If setting a 2% MDE means your experiment would have to run for six months, it's just not a practical test.

In that scenario, you might have to accept a higher MDE—say, 5%—just to get an answer within a reasonable timeframe, like two to four weeks. This trade-off is at the heart of experimentation. It's far better to run a faster test designed to detect a larger, more meaningful change than to get stuck in a "zombie experiment" that never ends because the MDE was too ambitious for your traffic.
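
To make that reality check concrete, here's a rough sketch (again Python with statsmodels; the 2% baseline and 1,500 visitors per day are made-up numbers) that converts a candidate MDE into an approximate test duration for a two-variation, 50/50 split.

```python
# pip install statsmodels
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

def estimated_test_duration(baseline, relative_mde, daily_visitors,
                            n_variations=2, alpha=0.05, power=0.80):
    """Rough number of days needed to reach the required sample size."""
    variant = baseline * (1 + relative_mde)
    effect_size = proportion_effectsize(variant, baseline)
    n_per_variation = NormalIndPower().solve_power(
        effect_size=effect_size, alpha=alpha, power=power,
        ratio=1.0, alternative='two-sided')
    return n_per_variation * n_variations / daily_visitors

# Illustrative numbers: 2% baseline conversion rate, 1,500 visitors/day in the test.
for mde in (0.02, 0.05, 0.10):
    days = estimated_test_duration(baseline=0.02, relative_mde=mde,
                                   daily_visitors=1_500)
    print(f"{mde:.0%} relative MDE -> roughly {days:,.0f} days")
```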

Calculating Your Sample Size with MDE

So, you've landed on a Minimum Detectable Effect that feels right. The next question is a big one: "How many people do we actually need to run this test?" This is where your MDE stops being a theoretical concept and becomes a critical planning tool, telling you exactly what sample size you need to get a reliable answer.

Getting this right is everything. Skip this step, and you're flying blind. You might run a test for two weeks, only to realize you needed six months of traffic to get a clear result. All that effort for nothing.

From MDE to User Count

Let's make this real with a classic e-commerce example. Imagine you want to test a redesigned "Add to Cart" button, hoping it convinces more people to click.

To figure out your sample size, you need four key ingredients:

  • Baseline Conversion Rate: Looking at your data, you see that 2.5% of visitors currently click the old button. This is your starting point.

  • Minimum Detectable Effect: Your team decides that a 10% relative lift is the smallest win worth celebrating. This means you're trying to see if the new button can push the conversion rate from 2.5% up to 2.75%.

  • Statistical Significance: You stick with the industry standard, aiming for 95% confidence.

  • Statistical Power: You also go with the standard 80% power, which gives you a great shot at spotting the uplift if it actually exists.

Once you have these four numbers, you're ready to calculate your sample size. And no, you don't need a Ph.D. in statistics. Plenty of online tools can do the heavy lifting for you.
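
If you'd rather script it than use a web calculator, a minimal Python sketch (assuming the statsmodels library) turns those four inputs into a required sample size.

```python
# pip install statsmodels
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.025              # 2.5% of visitors click the current button
target = baseline * 1.10      # 10% relative MDE -> 2.75%

effect_size = proportion_effectsize(target, baseline)   # Cohen's h

n_per_variation = NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,                # 95% significance
    power=0.80,                # 80% power
    ratio=1.0,                 # even split between control and variant
    alternative='two-sided')

print(f"Visitors needed per variation: {n_per_variation:,.0f}")
```

With these inputs, expect a figure in the neighborhood of 64,000 visitors per variation; exact numbers differ slightly between calculators depending on the approximation they use.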

For instance, plugging these values into a calculator like Optimizely's shows exactly how the inputs translate into a required sample size for each version of your button.

Just by plugging in those values, the team now knows precisely how many visitors they need to send to the original design and the new one to confidently detect that 10% change.

The Power of Online Calculators

The best part? You don’t have to dust off a textbook to do the math. Our own free A/B split test sample size calculator is perfect for getting quick, reliable answers.

Using a calculator lets you play around with different MDEs and see the direct impact on your sample size and timeline. This turns experiment planning from a guessing game into a strategic exercise.

Ultimately, the relationship between MDE and sample size is a trade-off you have to manage. Want to detect a tiny, subtle change? You'll need a massive sample size. For many e-commerce sites, trying to find an MDE smaller than a 2-3% improvement can require tens or even hundreds of thousands of users for each variation. That could mean running your test for weeks or months.

This reality check, which you can learn more about in this guide to the MDE-sample size interplay on MIDA.so, is essential for designing tests that are both meaningful and actually possible to complete.

Common MDE Mistakes That Invalidate Your Results


Defining your Minimum Detectable Effect is a huge step toward running disciplined, effective experiments. But even with the best intentions, a few common mistakes can sneak in and completely undermine your results. Getting MDE wrong doesn't just waste traffic; it can lead you straight into making bad business decisions based on faulty data.

Let's walk through the most frequent—and damaging—MDE pitfalls. Once you know what these traps look like, you can sidestep them and ensure your A/B testing program produces trustworthy, valuable insights.

Mistake 1: Setting the Bar Too High (MDE Is Too Large)

Setting an MDE that’s way too high is like using a fishing net with massive holes. You'll only catch the whales, but you'll miss out on all the perfectly good fish that could feed you for weeks. While hunting for huge wins feels exciting, you risk letting a steady stream of smaller, valuable improvements slip right through your fingers.

Imagine you demand a 15% uplift for any new test because you're only interested in "home runs." A variation then delivers a real 8% lift, but because you only collected enough data to detect a 15% effect, the test is underpowered for anything smaller and ends with "no significant result." You might have just ignored a real, profitable improvement that, when compounded over time, could have a major impact.

The fix is simple: re-evaluate the business case for smaller wins. Align your MDE with the actual cost of implementation. Is an 8% lift really not worth the effort? Most of the time, it absolutely is.

Mistake 2: Chasing Ghosts (MDE Is Too Small)

This is the flip side of the coin, and it’s just as dangerous. Setting an MDE that's too small for your traffic volume creates a "zombie experiment"—a test that's doomed to run forever without ever reaching the sample size it needs.

For example, a startup with 5,000 monthly visitors might set a 1% MDE, hoping to detect a tiny change on their homepage. Any sample size calculator will tell them they need hundreds of thousands of users for that. The test will never conclude, tying up resources and delivering zero actionable data.

This mistake often comes from wishful thinking rather than a realistic look at traffic numbers. It's always better to run a valid test for a bigger effect than an invalid one for an impossibly small effect.

Mistake 3: Ignoring MDE and Peeking at the Results

This is probably the most common error of all: not setting an MDE, launching a test, and then stopping it the second the p-value dips below 0.05. This habit, known as "peeking," dramatically increases your chance of getting a false positive. Statistical significance bounces around a lot during a test; a result that looks amazing on day three could just be random noise.

By not calculating your sample size based on a pre-determined MDE, you lose all the statistical discipline that keeps you from acting on chance. This is just one of many common A/B testing mistakes that can invalidate your entire program.

The solution? Always calculate your sample size before the test begins and commit to letting it run for the full duration. Don't call it early just because you see a green arrow. Let the data mature so you can be confident the result is real, not random.
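
To see why peeking is so dangerous, here's a small, self-contained simulation sketch (Python with NumPy and SciPy; every number in it is illustrative). It runs A/A tests, where no real difference exists, and compares how often "check 20 times and stop at the first p < 0.05" declares a winner versus judging the test only once at its planned end.

```python
# pip install numpy scipy
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

def two_proportion_pvalue(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - norm.cdf(abs(z)))

def run_aa_test(n_per_arm=20_000, true_rate=0.025, checks=20):
    """Simulate an A/A test; report whether any interim peek hit p < 0.05."""
    a = rng.random(n_per_arm) < true_rate
    b = rng.random(n_per_arm) < true_rate
    peeked_significant = False
    for i in np.linspace(n_per_arm / checks, n_per_arm, checks, dtype=int):
        if two_proportion_pvalue(a[:i].sum(), i, b[:i].sum(), i) < 0.05:
            peeked_significant = True        # "stop early, ship the winner"
    final_p = two_proportion_pvalue(a.sum(), n_per_arm, b.sum(), n_per_arm)
    return peeked_significant, final_p < 0.05

results = [run_aa_test() for _ in range(1_000)]
peeking_fp = np.mean([r[0] for r in results])
fixed_fp = np.mean([r[1] for r in results])
print(f"False positives with 20 interim peeks: {peeking_fp:.0%}")
print(f"False positives when judged only at the planned end: {fixed_fp:.0%}")
```

The peeking strategy typically flags far more false winners than the nominal 5% error rate, which is exactly the trap a pre-committed sample size protects you from.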

MDE Pitfalls and How to Fix Them

To make it even clearer, here’s a quick-reference table summarizing these common MDE mistakes. Think of it as a cheat sheet to keep your experimentation program on the right track.

| Common Mistake | Why It's a Problem | How to Fix It |
| --- | --- | --- |
| Setting an MDE that's too large | You miss out on smaller, valuable wins that could have a significant cumulative impact on your business. | Base your MDE on the cost of implementation and the real-world value of a smaller lift. Don't just chase home runs. |
| Setting an MDE that's too small | The experiment requires an unrealistic sample size, causing it to run forever without reaching statistical significance. | Use a sample size calculator to find an MDE that's realistic for your traffic. Focus on finding a detectable effect. |
| Ignoring MDE and "peeking" | Stopping a test as soon as it looks significant dramatically increases the risk of a false positive from random statistical noise. | Determine your MDE and sample size before you start. Commit to running the test for its full duration. No peeking! |

Avoiding these pitfalls isn't about rigid rules; it's about building a disciplined process. By setting a thoughtful MDE for every experiment, you ensure your results are reliable and your decisions are truly data-driven.

Connecting MDE to Broader Business Decisions

The idea behind the Minimum Detectable Effect is so much bigger than just tweaking a checkout button or testing a new headline. While A/B testing is where most of us in marketing first bump into the concept, MDE is actually the bedrock of smart decision-making anywhere data is used to measure impact. It's a universal tool for anyone who needs to know if a change actually worked.

When you start thinking about MDE this way, your experimentation program transforms. It stops being about simple conversion tweaks and becomes a more disciplined, scientific practice. You're connecting your everyday work to a long and powerful history of research and policy-making that changed the world.

MDE in the Real World

For a moment, forget you're a marketer. Imagine you're a public health official trying to figure out if a new community program actually lowers smoking rates. You can't just look for any tiny dip; you need to know if the program causes a meaningful drop—a change big enough to justify the program's cost and effort. That's your MDE.

You’ll find this same logic playing out everywhere:

  • Education: Does this new teaching method improve student test scores by a margin that really matters?

  • Finance: Is our new investment strategy outperforming the old one by an amount that justifies the risk?

  • Product Development: Will adding this new feature drive engagement up enough to make the development hours worthwhile?

In every scenario, MDE is the critical link between a statistical result and a real-world business decision. It forces everyone involved to get on the same page and define what "success" looks like before a single dollar is spent.

Historically, the concept has been essential in huge global development projects and randomized controlled trials. Organizations like the World Bank, for example, rely heavily on MDE analysis to design project evaluations. It helps them balance tight budgets with the need for scientifically sound results, ensuring they can accurately measure the true impact of their work. You can find further insights on the World Bank's DIME Wiki to see how they apply these principles.

MDE isn't just a technical setting in your A/B testing tool. It's a strategic declaration of what you consider a meaningful change, applicable to any effort where you need to measure cause and effect.

Understanding this wider context helps you see that setting an MDE for your marketing campaign is part of a powerful tradition. It's about ensuring that when you call something a "win," it’s a win that genuinely moves the needle.

Common Questions About MDE, Answered

Once you start digging into the minimum detectable effect, a few practical questions always seem to pop up. Let's tackle some of the most common ones to clear up any confusion and help you navigate the finer points of setting your MDE.

Absolute vs. Relative MDE

You'll often hear people talk about MDE in two different ways: absolute and relative. What’s the real difference?

  • Absolute MDE is a simple, fixed value. Think of it as aiming for a specific number. For example, you might want to see a 1% absolute jump in your conversion rate, hoping to lift it from 4% to 5%. It's direct and easy to understand.

  • Relative MDE is a percentage change based on your starting point. A 10% relative MDE on that same 4% baseline conversion rate means you're looking for a 0.4% absolute increase (taking you from 4% to 4.4%). Most teams use relative MDE because it scales and makes more sense when comparing experiments across metrics with vastly different baselines.
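
A quick arithmetic sketch (Python, using the same illustrative 4% baseline) makes the conversion explicit:

```python
baseline = 0.04          # 4% baseline conversion rate

relative_mde = 0.10      # "we want at least a 10% relative lift"
absolute_mde = baseline * relative_mde

target_rate = baseline + absolute_mde
print(f"Relative MDE of {relative_mde:.0%} on a {baseline:.0%} baseline")
print(f"= {absolute_mde:.1%} absolute, i.e. {baseline:.0%} -> {target_rate:.1%}")
```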

What Do I Do If My Experiment Is Underpowered?

What happens when you realize you just can't get the sample size your MDE calculation calls for?

When your test doesn't have enough data, it’s considered "underpowered." This is a big problem because it means your experiment probably won't be able to spot a real effect, even if your new feature is actually working. You’re flying blind.

If you find yourself in this situation, you have a few choices:

  1. Run the test for longer. The simplest solution is often just to give it more time to collect data.

  2. Increase your MDE. This means you'll only look for a bigger impact, which naturally requires less data to prove.

  3. Go ahead with lower statistical power. You can run the test as is, but you have to accept that there's a much higher risk of missing a genuine win (a false negative).

An underpowered experiment is worse than just being slow—it’s untrustworthy. It's often smarter to go back to the drawing board and aim for a larger, more realistic effect than to run a flawed test that can't give you a clear yes or no.
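
One practical way to act on option 2 above is to invert the calculation: fix the sample size you can realistically collect and solve for the smallest effect that sample can detect. Here's a hedged sketch in Python (statsmodels assumed; the 3% baseline and 8,000 users per variation are illustrative).

```python
# pip install statsmodels
from math import asin, sin, sqrt
from statsmodels.stats.power import NormalIndPower

baseline = 0.03            # illustrative 3% baseline conversion rate
n_per_variation = 8_000    # all the traffic you can realistically collect per arm

# Fix sample size, significance, and power -> solve for the standardized
# effect size (Cohen's h) this experiment is able to detect.
h = NormalIndPower().solve_power(nobs1=n_per_variation, alpha=0.05,
                                 power=0.80, ratio=1.0,
                                 alternative='two-sided')

# Invert Cohen's h back into a conversion rate to read it as an MDE.
detectable_rate = sin(asin(sqrt(baseline)) + h / 2) ** 2
relative_mde = detectable_rate / baseline - 1

print(f"Smallest reliably detectable lift: {baseline:.1%} -> "
      f"{detectable_rate:.2%} (about {relative_mde:.0%} relative)")
```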

Can MDE Be a Negative Number?

Is it ever useful to set a negative MDE?

Yes, absolutely. We usually get excited about finding positive improvements, but you can also set your MDE to reliably detect a negative change. This is incredibly useful for "holdback" experiments.

Imagine you're rolling out a big new feature to everyone. A holdback experiment keeps a small group of users on the old version to confirm the new one isn't causing any harm. By setting a negative MDE, you can prove with statistical confidence that your launch didn't hurt key business metrics. It's your safety net.
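
Sizing a holdback uses the same machinery, because the sample size depends on the magnitude of the effect, not its direction. A minimal sketch (Python with statsmodels; the 5% baseline, the 5% relative drop, and the even split are all illustrative assumptions):

```python
# pip install statsmodels
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.05          # illustrative 5% baseline conversion rate
relative_drop = 0.05     # the decline we want to be able to detect: -5% relative
degraded = baseline * (1 - relative_drop)

# Only the magnitude of the effect drives the sample size, so take the
# absolute effect size; a one-sided test reflects that we only care about harm.
effect_size = abs(proportion_effectsize(degraded, baseline))

n_per_group = NormalIndPower().solve_power(effect_size=effect_size,
                                           alpha=0.05, power=0.80,
                                           ratio=1.0,   # even split (see note below)
                                           alternative='larger')

print(f"Users needed in each group to detect a {relative_drop:.0%} relative drop: "
      f"{n_per_group:,.0f}")
```

In practice, holdback groups are usually much smaller than the rollout group; the ratio argument lets you model that uneven split, at the cost of a larger total sample for the same power.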

At Humblytics, we build tools to make sophisticated experimentation feel simple. From figuring out the right MDE to visualizing your funnels and launching tests without writing code, our platform gives you the power to make smarter, data-driven decisions. See how you can grow revenue with confidence.