Last updated: 3 May 2023

What if you had to make a decision that would impact your business revenue, but it was only based on an estimate? You'd want the most accurate estimate possible—or even better, a range of estimates you could calculate with different confidence levels.

Accurate prediction is one of the true tests of statistical data. The ability to derive insight from a small sample size and accurately apply it at a much broader scale brings market research to a whole new level.

When applying the statistical average of a small random sample to a larger population, the level of accuracy will vary depending on many factors. You can use a confidence interval formula to work out how to express the accuracy of the statistical analysis when applied to a larger group.

While uncertainty is a fact of life, it isn't completely random, either. It's even possible to predict, with statistical accuracy, the likelihood of certain events by calculating them in a smaller sample first. Of course, this information will only be accurate to a certain degree when applied to a larger group. The confidence interval expresses the estimate’s accuracy.

The confidence interval formula produces a range, expressed as a lower and an upper limit, that is likely to contain the true value for the larger population (such as the population mean). The confidence level attached to that range describes how often ranges built this way from random samples would capture the true value.

The final result is an estimated mean, plus or minus a margin of error, which creates a wider or narrower range of plausible values. Such an estimate can't be made with 100% certainty, but confidence levels of 90% or higher are typical. The width of the range depends on how much certainty (or confidence) the researcher requires.

More reliable predictions, with higher confidence levels, result in a wider range; a narrower range is more fallible but presents tighter estimates. Depending on the confidence level (or the allowable margin of error), the following elements change:

the predicted value range in a new, often larger, sample

the accuracy of those predicted values

If you have a key decision to make that impacts your revenue, for example, you need to know how likely it is that your estimates are correct. Confidence intervals can tell you this. A well-executed confidence interval formula is useful when you want to make decisions within a certain threshold of certainty.

In business, confidence intervals can put bounds on estimates of the KPIs and demographic measurements that profits often depend on.

Here are some specific examples of when confidence intervals would come in handy:

Marketers wanting to know how likely it is that the lead conversion rate from a small ad campaign will hold when the campaign is scaled up.

Revenue teams interested in learning how reliably the ROI from investing resources in one market segment might carry over to other market segments.

Product designers who see promising UX statistics in a focus group, suggesting a new feature is a hit, but aren't sure whether those results will translate to the wider population.

The confidence interval formula provides researchers with accurate predictions within a specified margin of error. It takes statistical analysis outside the bounds of small, time-limited samples, allowing statisticians to apply known patterns to new, larger populations.

Use the confidence interval formula to take the mean score of a random sample and estimate how precisely it reflects the mean of the larger population. There are other variables to be aware of (such as standard deviation), so let's pick apart and apply the confidence interval formula.

The confidence interval formula takes the sample mean (x̄), then adds and subtracts a margin of error: the confidence level value (z) multiplied by the sample standard deviation (s) divided by the square root of the sample size (√n):

**CI = x̄ ± z (s / √n)**

**CI = confidence interval**, which results in a lower and an upper limit

**x̄ = sample mean**, derived from the original sample

**z = confidence level value**, the critical value from the standard normal distribution that corresponds to the chosen confidence level (for example, 1.96 for 95%)

**s = sample standard deviation**, a single figure showing how far the values in the original sample typically fall from the sample mean

**n = sample size** of the original sample
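The formula can be sketched as a small function. This is a minimal illustration using only Python's standard library; the function name `confidence_interval` and the sample values are made up for the example, and it assumes the z value is taken from the standard normal distribution and that s is the sample standard deviation (n - 1 divisor).

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for x_bar ± z * (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)                             # sample mean
    s = stdev(sample)                                # sample standard deviation (n - 1 divisor)
    z = NormalDist().inv_cdf((1 + confidence) / 2)   # two-sided z value
    margin = z * s / math.sqrt(n)                    # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical data, for illustration only.
lower, upper = confidence_interval([12, 15, 11, 14, 13, 16, 12, 15], confidence=0.95)
print(f"{lower:.2f} to {upper:.2f}")
```

The interval is always centered on the sample mean, so the two limits sit the same distance above and below it.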

**Find the sample mean (x̄).** Average the scores of all participants in the original sample. The confidence interval (CI) is centered on this figure.

**Calculate the standard deviation (s).** This is a multi-step process: subtract the sample mean (x̄) from each individual score, square each result separately, add the squares together, divide by the sample size minus one (n - 1), and take the square root.

**Find the z value for the preferred confidence level.** The confidence level is typically 90% to 99%. The corresponding z value is a decimal figure, e.g. 1.645 for 90%, 1.96 for 95%, and 2.576 for 99%.

**Use these results in the formula.** Place each of these figures into the formula to find the confidence interval (CI) range.

**Interpret your results.** This is the range expected to contain the population mean, at the level of certainty set by the confidence level value (z).
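The z value in the third step does not have to come from a printed table. As a rough sketch, assuming a two-sided interval based on the standard normal distribution, it can be derived from the confidence level with `statistics.NormalDist` (the helper name `z_value` is hypothetical):

```python
from statistics import NormalDist

def z_value(confidence):
    """Two-sided z value for a confidence level given as a decimal (e.g. 0.95)."""
    # Split the leftover probability evenly between the two tails.
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_value(level):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```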

Remember that a wider CI range goes with a higher confidence level, while a narrower range goes with a lower confidence level. In other words, for the same data, the width of the confidence interval and the confidence level rise and fall together.
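To see how the width of the range moves with the confidence level, the same summary statistics can be run through the formula at several levels. The figures below (mean 50, standard deviation 10, sample size 25) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics, for illustration only.
x_bar, s, n = 50.0, 10.0, 25
standard_error = s / math.sqrt(n)  # 10 / 5 = 2.0

widths = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z * standard_error
    widths[confidence] = 2 * margin
    print(f"{confidence:.0%}: {x_bar - margin:.2f} to {x_bar + margin:.2f}")
# 90%: 46.71 to 53.29
# 95%: 46.08 to 53.92
# 99%: 44.85 to 55.15
```

Nothing about the data changes between the three lines; only the z value does, and the range widens with it.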

If the confidence interval is too wide, it may not be useful. Choose an acceptable margin of error for your purposes.

Using an __Excel template__ (we recommend using the first example, listed on sheet "1-Cl for m"), or __a simpler alternative from WallStreetMojo__, complete the following examples.

Use the test scores 80, 75, 90, 80, 75, 75, 85, 80, 75, and 90 as the sample data. Add these numbers together and divide by 10 to get an average score, or sample mean, of 80.5.

Using this same data set, calculate the standard deviation by subtracting the sample mean from each score, squaring each result separately, adding the results together, dividing by the number of scores minus one (9), and taking the square root. Starting from the beginning, it would be: (80 - 80.5)^2 + (75 - 80.5)^2, and so on, which sums to 322.5. Dividing by 9 gives 35.83, and the square root of that is approximately 5.986.

Decide on a confidence level (typically 90% to 99%; we'll use 90%, which has a z value of 1.645), then plug all these values into the confidence interval formula, as follows: CI = x̄ ± z (s ÷ √n) = 80.5 ± 1.645 (5.986 ÷ √10) = 80.5 ± 3.11

In plain English, the answer could be stated as: "With 90% confidence, the population mean is between 77.4 and 83.6."
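The arithmetic for the test-score data can be checked step by step. This is a minimal sketch in plain Python, assuming the sample standard deviation uses the n - 1 divisor and that z = 1.645 for a 90% confidence level; the variable names are illustrative.

```python
import math

scores = [80, 75, 90, 80, 75, 75, 85, 80, 75, 90]
n = len(scores)

x_bar = sum(scores) / n                                    # 80.5
squared_deviations = [(x - x_bar) ** 2 for x in scores]    # sums to 322.5
variance = sum(squared_deviations) / (n - 1)               # 322.5 / 9
s = math.sqrt(variance)                                    # about 5.986
z = 1.645                                                  # 90% confidence level
margin = z * s / math.sqrt(n)                              # about 3.11

print(f"{x_bar} ± {margin:.2f}")
print(f"{x_bar - margin:.1f} to {x_bar + margin:.1f}")     # 77.4 to 83.6
```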

Input the ages 20, 25, 30, 35, and 40 into your data set. This results in a mean of 30.

Calculate the standard deviation as described above: [(20 - 30)^2 + (25 - 30)^2 + (30 - 30)^2 + (35 - 30)^2 + (40 - 30)^2] / 4 = 62.5, and the square root of 62.5 is 7.906

Choose your confidence level (we'll select 95%, which has a z value of 1.96) and run these figures through the formula: CI = x̄ ± z (s ÷ √n) = 30 ± 1.96 (7.906 / √5) = 30 ± 6.93

"With 95% confidence, the population mean is between 23.1 and 36.9."
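For the ages data, the choice of divisor in the standard deviation is worth a quick check, because dividing by n and dividing by n - 1 give noticeably different figures in a sample of five. The sketch below uses Python's `statistics` module, where `pstdev` divides by n and `stdev` divides by n - 1, and assumes z = 1.96 for a 95% confidence level.

```python
import math
from statistics import mean, pstdev, stdev

ages = [20, 25, 30, 35, 40]
n = len(ages)

print(f"divide by n:     {pstdev(ages):.3f}")    # 7.071
print(f"divide by n - 1: {stdev(ages):.3f}")     # 7.906

z = 1.96  # 95% confidence level
margin = z * stdev(ages) / math.sqrt(n)
print(f"{mean(ages):.1f} ± {margin:.2f}")        # 30.0 ± 6.93
```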

The 95% confidence level is one of the most commonly used. It provides high confidence while keeping the results narrow enough that the upper and lower limits aren't excessively wide.

This is important to ensure the final calculation is specific enough to be useful (as it doesn't tell you much to say you have an extremely wide range of possibilities) yet still falls within a high probability range (an overly specific range is less probable).

As in the final step in the examples, communicating the results in narrative form clarifies the purpose of the confidence interval formula and makes the results meaningful.

We'll break it down into two components:

The range, or upper and lower limits

The certainty, as a percentage, of those figures

By stating "the confidence interval is between X and Y," or "the confidence interval has a lower limit of X and an upper limit of Y," you're saying that the true value for the total population (here, the population mean) is estimated to lie between X and Y. How sure can you be? That is set by the confidence level, which is expressed as a percentage: it is the share of intervals, calculated this way from repeated random samples, that would contain the true value.

So, altogether, you would translate the results of the formula as follows: "With Z% confidence (say, 95%), the mean for the total population is between X and Y."
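One way to make the percentage concrete is to simulate it. The sketch below draws many random samples from a made-up population whose mean is known, builds a 95% interval from each sample, and counts how often the interval contains the true mean. The population figures, sample size, and number of trials are all hypothetical, and the result should land near, not exactly on, 95%.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(0)  # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 100.0, 15.0   # hypothetical population
SAMPLE_SIZE, TRIALS = 50, 2000
z = NormalDist().inv_cdf(0.975)    # 95% confidence level

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    margin = z * stdev(sample) / math.sqrt(SAMPLE_SIZE)
    # Count the trial if the true mean falls inside x_bar ± margin.
    if abs(mean(sample) - TRUE_MEAN) <= margin:
        hits += 1

coverage = hits / TRIALS
print(f"Intervals containing the true mean: {coverage:.1%}")
```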
