From a Single Guess to a Range of Plausible Values
When you use a sample to learn about a population, you usually want to estimate an unknown population value (a parameter) such as a population mean or a population proportion. A point estimate is a single best guess from your sample (for example, the sample mean or sample proportion). A confidence interval adds context: it gives a range of plausible values for the parameter, based on how much sampling variation you expect.
Think of it this way: a point estimate answers “What is my best guess?” A confidence interval answers “How wrong could my guess reasonably be, given the sample size and variability?”
Point estimates you will use most often
- Population mean (unknown): estimated by the sample mean x̄.
- Population proportion (unknown): estimated by the sample proportion p̂.
What a Confidence Interval Is (and Is Not)
What it is
A confidence interval (CI) is constructed from your data using a rule that is designed to capture the true parameter a certain percentage of the time in the long run.
A generic CI has the form:
estimate ± (critical value) × (standard error)
Where:
- estimate is x̄ or p̂.
- standard error (SE) measures typical sampling variation of the estimate.
- critical value depends on the chosen confidence level (e.g., 90%, 95%, 99%).
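The generic form above can be sketched as a small helper. This is a minimal illustration, not a library API: the function name `confidence_interval` and the input numbers are assumptions made for the example.

```python
def confidence_interval(estimate, critical_value, standard_error):
    """Return (lower, upper) for estimate ± critical_value × standard_error."""
    margin_of_error = critical_value * standard_error
    return estimate - margin_of_error, estimate + margin_of_error


# Hypothetical inputs: estimate 50.0, 95% critical value 1.96, SE 2.0.
lower, upper = confidence_interval(50.0, 1.96, 2.0)
print(f"({lower:.2f}, {upper:.2f})")  # prints (46.08, 53.92)
```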
What it is not
- Not “a 95% probability the parameter is in this interval” for your one fixed dataset. After you compute the interval, the parameter is either in it or not.
- Not a guarantee that the interval contains the truth.
- Not a statement about individual outcomes (it is about a population parameter).
The Long-Run Interpretation: Many Intervals from Repeated Samples
The most reliable way to understand confidence is to imagine repeating the same sampling process many times, each time building a CI using the same method.
Suppose you build 100 separate 95% confidence intervals from 100 independent samples of the same size from the same population:
- About 95 of those intervals will contain the true parameter.
- About 5 will miss it (sometimes by a little, sometimes by a lot).
Visually, you can picture a “stack” of intervals. Most cross the true value (a vertical reference line), and a few do not. The confidence level is about the procedure’s hit rate, not about uncertainty in a single finished interval.
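One way to see the long-run hit rate is to simulate it. The sketch below assumes a normal population with a known mean (all parameter values are made up for illustration), draws many samples, builds an interval from each, and counts how often the interval contains the true mean. The critical value 2.023 is an approximate 95% t value for samples of size 40.

```python
import random
import statistics


def coverage_rate(true_mean=50.0, true_sd=10.0, n=40, trials=2000,
                  t_star=2.023, seed=1):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = statistics.mean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        # Does this sample's interval capture the true mean?
        if xbar - t_star * se <= true_mean <= xbar + t_star * se:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95, though not exactly on it, because the simulation itself is a finite sample of intervals.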
Why Confidence Intervals Change: Sample Size, Variability, Confidence Level
1) Sample size: larger samples give narrower intervals
Standard errors shrink as sample size grows, because averaging (or aggregating) more information reduces random fluctuation.
- For a mean, the SE is roughly proportional to 1/√n.
- For a proportion, the SE is also proportional to 1/√n (with an additional dependence on p).
Practical implication: to cut the margin of error in half, you typically need about four times the sample size.
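A quick numeric check of the "four times the sample size" rule, as a sketch. The standard deviation of 12.0 and the critical value 1.96 are assumed values chosen only to make the pattern visible.

```python
import math


def margin_of_error(s, n, critical_value=1.96):
    """Margin of error for a mean: critical value × s / √n."""
    return critical_value * s / math.sqrt(n)


s = 12.0  # hypothetical sample standard deviation
for n in (100, 400, 1600):
    # Each step quadruples n, so each margin is half the previous one.
    print(n, round(margin_of_error(s, n), 3))
```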
2) Variability: more spread means wider intervals
If the underlying measurements vary a lot from person to person (or unit to unit), the sample mean will bounce around more from sample to sample. That larger bounce shows up as a larger SE, which widens the CI.
For proportions, variability is highest near 50% and lower near 0% or 100%. That’s why estimating a proportion near 0.5 tends to require larger samples to achieve the same margin of error.
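The dependence on p can be checked directly. This sketch evaluates the standard error √(p(1 − p)/n) at a few proportions for a fixed, hypothetical sample size of 400; the helper name `proportion_se` is illustrative.

```python
import math


def proportion_se(p, n):
    """Standard error of a sample proportion when the true proportion is p."""
    return math.sqrt(p * (1 - p) / n)


n = 400  # hypothetical sample size
for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    # The SE peaks at p = 0.5 and shrinks symmetrically toward 0 and 1.
    print(f"p = {p:.2f}  SE = {proportion_se(p, n):.4f}")
```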
3) Confidence level: more confidence means wider intervals
Higher confidence requires a larger critical value, which multiplies the SE and increases the margin of error.
- A 99% CI is wider than a 95% CI from the same data.
- A 90% CI is narrower than a 95% CI from the same data.
This is a trade-off: higher confidence means you cast a wider net.
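To see how the critical value grows with the confidence level, the sketch below looks up two-sided standard normal critical values using Python's standard library (`statistics.NormalDist`, available from Python 3.8). The helper name `z_critical` is an assumption for this example; t-based critical values would be somewhat larger for small samples.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Two-sided standard normal critical value for a level in (0, 1)."""
    tail = (1 - confidence_level) / 2  # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_critical(level):.3f}")
```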
Confidence Intervals for a Population Mean (Intuitive Level)
Goal: estimate an unknown population mean (average) using a sample mean x̄.
Core idea
Your sample mean is unlikely to equal the population mean exactly. A CI uses the estimated typical error of x̄ to create a plausible range for the true mean.
A common form is:
x̄ ± t* × (s / √n)
Where s is the sample standard deviation and t* is a critical value (often from a t distribution) chosen to match the confidence level.
Step-by-step: building a CI for a mean
1. Compute the point estimate: calculate x̄.
2. Measure sample variability: compute s.
3. Compute the standard error: SE = s/√n.
4. Choose a confidence level (e.g., 95%) and obtain the corresponding t* for your sample size.
5. Compute margin of error: ME = t* × SE.
6. Form the interval: (x̄ − ME, x̄ + ME).
Example (mean): average delivery time
A company samples n = 64 deliveries and finds an average delivery time of x̄ = 42.0 minutes with sample standard deviation s = 12.0 minutes. For an intuitive 95% CI, use a critical value near 2 (the exact t* depends on n; with n = 64 it is about 2.00).
- SE = 12 / √64 = 12 / 8 = 1.5
- ME ≈ 2 × 1.5 = 3.0
- CI ≈ 42.0 ± 3.0 → (39.0, 45.0)
Interpretation: using this method, the data support a plausible range for the population mean delivery time of about 39 to 45 minutes.
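The step-by-step recipe and the delivery-time example can be reproduced in a few lines. This is a sketch working from summary statistics; the function name `mean_ci` is illustrative, and the critical value is passed in by hand (2.0 here, as in the example) rather than looked up from a t table.

```python
import math


def mean_ci(xbar, s, n, t_star):
    """CI for a mean from summary statistics: xbar ± t* × s / √n."""
    se = s / math.sqrt(n)   # standard error of the sample mean
    me = t_star * se        # margin of error
    return xbar - me, xbar + me


# Delivery-time example: n = 64, x̄ = 42.0, s = 12.0, critical value near 2.
lower, upper = mean_ci(xbar=42.0, s=12.0, n=64, t_star=2.0)
print(f"95% CI for mean delivery time: ({lower:.1f}, {upper:.1f}) minutes")
```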
Confidence Intervals for a Population Proportion (Intuitive Level)
Goal: estimate an unknown population proportion (a rate, percentage, or probability) using a sample proportion p̂.
Core idea
If you sample n individuals and observe a “success” in x of them, then p̂ = x/n. A CI accounts for the fact that p̂ varies from sample to sample.
A common approximate form is:
p̂ ± z* × √(p̂(1 − p̂)/n)
Where z* is a critical value from the standard normal distribution (about 1.64 for 90%, 1.96 for 95%, 2.58 for 99%).
Note: In practice, some methods perform better than the simple formula above, especially for small samples or proportions near 0 or 1. The meaning and interpretation of the CI remain the same.
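The text does not name a specific alternative. As one illustration, the sketch below implements the Wilson score interval, a commonly used method that tends to behave better for small samples. Choosing Wilson here is an assumption of this example, and the data (2 successes in 20 trials) are hypothetical.

```python
import math


def wilson_interval(x, n, z_star=1.96):
    """Wilson score interval for a proportion (x successes out of n)."""
    p_hat = x / n
    z2 = z_star ** 2
    denom = 1 + z2 / n
    center = (p_hat + z2 / (2 * n)) / denom
    half_width = (z_star / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z2 / (4 * n ** 2)
    )
    return center - half_width, center + half_width


# Hypothetical small sample: 2 successes out of 20.
lower, upper = wilson_interval(2, 20)
print(f"Wilson 95% CI: ({lower:.3f}, {upper:.3f})")
```

For the same data, the simple formula gives a lower limit below 0, which is not a possible proportion; the Wilson interval stays inside the range from 0 to 1.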
Step-by-step: building a CI for a proportion
1. Count successes: record x out of n.
2. Compute the point estimate: p̂ = x/n.
3. Compute the standard error: SE = √(p̂(1 − p̂)/n).
4. Choose a confidence level and corresponding z*.
5. Compute margin of error: ME = z* × SE.
6. Form the interval: (p̂ − ME, p̂ + ME), usually reported as percentages.
Example (proportion): app conversion rate
An app team samples n = 400 visitors and observes x = 92 sign-ups. Then p̂ = 92/400 = 0.23. For a 95% CI, use z* ≈ 1.96.
- SE = √(0.23 × 0.77 / 400) ≈ √(0.1771 / 400) ≈ √0.0004428 ≈ 0.0210
- ME ≈ 1.96 × 0.0210 ≈ 0.041
- CI ≈ 0.23 ± 0.041 → (0.189, 0.271)
Interpretation: the data support a plausible range of about 18.9% to 27.1% for the population conversion rate, using this 95% method.
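The same steps for a proportion, applied to the conversion example, can be sketched as follows. The function name `proportion_ci` is illustrative, and it implements only the simple approximate formula given above.

```python
import math


def proportion_ci(x, n, z_star=1.96):
    """Simple approximate CI for a proportion: p̂ ± z* × √(p̂(1 − p̂)/n)."""
    p_hat = x / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    me = z_star * se
    return p_hat - me, p_hat + me


# Conversion example: 92 sign-ups out of 400 visitors.
lower, upper = proportion_ci(x=92, n=400)
print(f"95% CI for conversion rate: ({lower:.1%}, {upper:.1%})")
```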
Reading a Confidence Interval Like a Decision-Maker
Center: what value is most supported?
The interval is centered around the point estimate. If the interval is (39, 45), values near 42 are most consistent with the sample.
Width: how precise is the estimate?
Two studies can have the same point estimate but different precision. A narrow CI suggests high precision; a wide CI suggests substantial uncertainty.
Compare to a benchmark
Often you care whether the parameter is above/below a target:
- Mean delivery time target: 40 minutes. If the CI is (39, 45), the data do not clearly support that the mean is below 40.
- Conversion target: 25%. If the CI is (18.9%, 27.1%), the data are consistent with being below or above 25%.
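The benchmark comparison amounts to asking where the target falls relative to the interval's endpoints. A minimal sketch, with a hypothetical helper name and labels:

```python
def compare_to_target(lower, upper, target):
    """Describe where a CI sits relative to a target value."""
    if upper < target:
        return "below target"   # whole interval lies under the target
    if lower > target:
        return "above target"   # whole interval lies over the target
    return "inconclusive"       # target is inside the interval


print(compare_to_target(39.0, 45.0, 40.0))     # delivery time vs. 40 minutes
print(compare_to_target(0.189, 0.271, 0.25))   # conversion rate vs. 25%
```

Both examples from the text come out as "inconclusive", because each target lies inside its interval.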
Confidence Level vs. Margin of Error: A Concrete Comparison
Suppose you have the same estimate and SE, and you only change the confidence level:
| Confidence level | Typical critical value | Effect on interval width |
|---|---|---|
| 90% | z* ≈ 1.64 | Narrower |
| 95% | z* ≈ 1.96 | Wider |
| 99% | z* ≈ 2.58 | Widest |
If your SE is 0.021 (as in the conversion example), then:
- 90% ME ≈ 1.64 × 0.021 = 0.034
- 95% ME ≈ 1.96 × 0.021 = 0.041
- 99% ME ≈ 2.58 × 0.021 = 0.054
The data did not change; only your desired long-run coverage did.
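The comparison above can be reproduced by holding the estimate and SE fixed and varying only the critical value. A small sketch using the rounded values from the table and the conversion example (p̂ = 0.23, SE ≈ 0.021):

```python
se = 0.021      # standard error from the conversion example
p_hat = 0.23    # point estimate from the conversion example
critical_values = {"90%": 1.64, "95%": 1.96, "99%": 2.58}

# Margin of error for each confidence level, same data throughout.
margins = {level: z * se for level, z in critical_values.items()}

for level, me in margins.items():
    print(f"{level}: ME = {me:.3f}, CI = ({p_hat - me:.3f}, {p_hat + me:.3f})")
```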
Common Misinterpretations (and Better Alternatives)
Misinterpretation 1: “There is a 95% chance the true value is in my interval.”
Why it’s wrong: the parameter is fixed; your interval is the random object (it would change if you resampled).
Better: “This method produces intervals that contain the true value 95% of the time in repeated sampling.”
Misinterpretation 2: “A wider interval means the data are bad.”
Why it’s misleading: width reflects uncertainty, which can be appropriate when samples are small or variability is high.
Better: “The estimate is imprecise; to narrow it, increase sample size or reduce measurement noise.”
Misinterpretation 3: “If 0 is not in the interval, the effect is important.”
Why it’s wrong: excluding 0 (for differences) or excluding 1 (for ratios) relates to statistical detectability, not practical impact. With large samples, tiny effects can be statistically detectable.
Better: judge importance by the size of the effect and whether the CI lies within a practically meaningful range. For example, a conversion lift CI of (+0.2%, +0.6%) might be statistically clear but operationally minor.
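The distinction between "statistically clear" and "practically important" can be made explicit by checking an interval against both 0 and a minimum effect size that matters. In this sketch the function name, the labels, and the threshold of one percentage point are all assumptions for illustration; a real threshold comes from the business context.

```python
def assess_effect(lower, upper, practical_threshold):
    """Classify a CI for a difference against 0 and a practical threshold.

    practical_threshold is a hypothetical, domain-chosen minimum effect
    size (same units as the interval) that would matter operationally.
    """
    excludes_zero = lower > 0 or upper < 0
    if not excludes_zero:
        return "not statistically clear"
    smallest = min(abs(lower), abs(upper))
    largest = max(abs(lower), abs(upper))
    if smallest >= practical_threshold:
        return "statistically clear and practically meaningful"
    if largest < practical_threshold:
        return "statistically clear but practically minor"
    return "statistically clear, practical importance uncertain"


# Conversion lift CI of (+0.2%, +0.6%), with an assumed 1-point threshold.
verdict = assess_effect(0.002, 0.006, practical_threshold=0.01)
print(verdict)  # prints: statistically clear but practically minor
```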
Misinterpretation 4: “Overlapping confidence intervals mean there is no difference.”
Why it’s unreliable: overlap is not a definitive test of difference; it depends on how the intervals were built and whether samples are independent.
Better: if comparing groups is the goal, construct a CI for the difference (or use a dedicated comparison method) rather than comparing two separate CIs informally.
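To illustrate why overlap is not a reliable test, the sketch below builds separate intervals for two groups and then an interval for their difference. It assumes two independent samples and uses the simple approximate formula for proportions; the data (500/1000 versus 550/1000) and function names are hypothetical.

```python
import math


def proportion_ci(x, n, z_star=1.96):
    """Simple approximate CI for one proportion."""
    p_hat = x / n
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me


def difference_ci(x1, n1, x2, n2, z_star=1.96):
    """Approximate CI for p2 − p1, assuming independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff - z_star * se, diff + z_star * se


ci_a = proportion_ci(500, 1000)              # group A
ci_b = proportion_ci(550, 1000)              # group B
ci_diff = difference_ci(500, 1000, 550, 1000)

print(f"A: ({ci_a[0]:.3f}, {ci_a[1]:.3f})")
print(f"B: ({ci_b[0]:.3f}, {ci_b[1]:.3f})")
print(f"B - A: ({ci_diff[0]:.3f}, {ci_diff[1]:.3f})")
```

With these numbers the two separate intervals overlap, yet the interval for the difference lies entirely above 0.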
Misinterpretation 5: “The confidence level measures data quality.”
Why it’s wrong: confidence level is a choice you make; it controls long-run coverage by widening or narrowing the interval.
Better: treat confidence level as a policy decision about risk tolerance, and treat CI width as the indicator of precision.
Practical Checklist Before You Trust a Confidence Interval
- Is the sample reasonably representative? If the sampling process is biased, a narrow CI can still be centered on the wrong value.
- Is the sample size adequate for the method? Very small samples can make simple approximations unreliable, especially for proportions near 0 or 1.
- Are observations reasonably independent? If data points are strongly related (e.g., repeated measures without accounting for it), the SE can be underestimated, producing intervals that are too narrow.
- Is the interval answering the right question? A CI for a mean answers a different question than a CI for a proportion or for a difference between groups.