What drives A/B test sample size
A/B test sample size depends mainly on four choices: your confidence level, your target power, your baseline conversion rate, and the minimum detectable effect you care about.
The smaller the lift you need to detect, the more traffic you need. Lower baseline conversion rates also increase the required sample for the same relative lift.
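These four inputs combine in the standard two-proportion sample size formula. The sketch below is illustrative, not the exact method any particular calculator uses; it assumes a two-sided test and a minimum detectable effect expressed as a relative lift, and the function name and defaults are made up for this example.

```python
import math
from statistics import NormalDist

def per_variant_sample_size(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate per-variant n for a two-proportion z-test.

    baseline: control conversion rate (e.g. 0.05 for 5%)
    relative_mde: smallest relative lift worth detecting (e.g. 0.10 for +10%)
    alpha: 1 - confidence level, two-sided (0.05 -> 95% confidence)
    power: probability of detecting a true lift of that size (0.80 -> 80%)
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 at 95% confidence
    z_power = NormalDist().inv_cdf(power)          # ~0.84 at 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)

# 5% baseline, +10% relative lift, defaults of 95% confidence and 80% power
print(per_variant_sample_size(0.05, 0.10))
```

Note how the squared effect-size term in the denominator is what makes small lifts so expensive: shrinking the detectable lift shrinks `(p2 - p1) ** 2` quadratically.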
Why this matters
Running an experiment with too little data makes it easier to miss real differences or overreact to random noise. Planning sample size in advance reduces the temptation to stop early based on unstable results.
This page gives you a practical target per variant so you can judge whether a test is realistic before launch.
- Estimate traffic needs before launch
- Set realistic test durations
- Avoid underpowered experiments
- Align teams on what counts as a meaningful lift
How to use the result
The per-variant result tells you roughly how many observations each variant should receive. The total sample size is the combined traffic across both variants.
If the result looks too large for your available traffic, the usual next step is to reconsider the minimum detectable effect, not to run the same test with less data.
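The leverage of the minimum detectable effect is easy to quantify: required sample scales roughly with the inverse square of the effect size. A back-of-the-envelope sketch (the function and numbers are illustrative only):

```python
def sample_size_multiplier(current_mde, proposed_mde):
    """Rough multiplier on required sample size when the minimum
    detectable effect changes (n scales approximately with 1 / MDE^2)."""
    return (current_mde / proposed_mde) ** 2

# Halving the MDE roughly quadruples the traffic you need,
# which is why relaxing the MDE beats shrinking the sample:
print(sample_size_multiplier(0.10, 0.05))  # → 4.0
```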
How to turn the result into a test plan
Once you have a per-variant sample target, compare it with weekly traffic to estimate how long the experiment will need to run. That helps you decide whether the test is realistic before design and engineering work is committed.
Sample size is only one part of experiment quality. Clean tracking, a stable baseline, and a clear stopping rule still matter because a large sample cannot rescue a poorly run test.
- Estimate duration from per-variant traffic, not total site traffic
- Choose the minimum detectable effect before launch
- Keep allocation and tracking stable during the run
- Avoid stopping early when results look temporarily promising
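Turning the per-variant target into a run length can be sketched as below, assuming eligible traffic is split evenly across variants; the figures in the example are made up:

```python
import math

def weeks_to_run(per_variant_target, weekly_eligible_visitors, variants=2):
    """Estimate run length in whole weeks, assuming an even split
    of eligible traffic across all variants."""
    per_variant_weekly = weekly_eligible_visitors / variants
    return math.ceil(per_variant_target / per_variant_weekly)

# e.g. 31,000 per variant with 10,000 eligible visitors/week, two variants
print(weeks_to_run(31_000, 10_000))  # → 7
```

Rounding up to whole weeks is deliberate: running full weeks keeps day-of-week traffic patterns balanced across the experiment.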