Mean, proportion & variance • 2026 edition
Mean (σ known): \( \bar{x} \pm Z \cdot \frac{\sigma}{\sqrt{n}} \)
Mean (σ unknown): \( \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} \)
Proportion: \( \hat{p} \pm Z \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \)
Difference of Means: \( (\bar{x}_1 - \bar{x}_2) \pm t \cdot SE \)
Difference of Proportions: \( (\hat{p}_1 - \hat{p}_2) \pm Z \cdot SE \)
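The two difference formulas leave SE unspecified. As a sketch, assuming two independent samples and unpooled (separate) variance estimates, the standard errors are commonly written as follows; pooled versions take a different form, and the subscripted symbols simply extend the notation used above.

```latex
% Difference of means (unpooled, independent samples)
SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}

% Difference of proportions (unpooled, independent samples)
SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
```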
Confidence intervals provide a range of plausible values for population parameters based on sample statistics. They express the uncertainty in our estimates with a specified level of confidence (e.g., 95%). The width of the interval depends on sample size, variability, and confidence level.
A confidence interval provides a range of values that is likely to contain an unknown population parameter with a specified level of confidence. For example, a 95% confidence interval means that if we were to take many samples and construct confidence intervals from each sample, about 95% of those intervals would contain the true population parameter.
\( \bar{x} \pm t \cdot \frac{s}{\sqrt{n}} \)
Uses t-distribution when population standard deviation is unknown.
\( \hat{p} \pm Z \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \)
For estimating population proportion from sample proportion.
What does a 95% confidence interval mean?
The answer is B) If we repeated the sampling process many times, about 95% of the intervals would contain the true parameter. This is the correct frequentist interpretation of confidence intervals. The confidence level refers to the long-run proportion of intervals that would contain the true parameter if we repeated the sampling process many times.
This is a crucial distinction in statistical inference. A confidence interval does not provide a probability that the true parameter is in a specific interval. Instead, it describes the reliability of the estimation procedure. Once we calculate a specific interval, the true parameter is either in it (probability 1) or not in it (probability 0), but we don't know which.
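The long-run interpretation can be checked with a small simulation. The sketch below is illustrative only: the function name `coverage_rate` and the values chosen for the true mean, σ, sample size, and number of trials are assumptions, not part of the original material. It uses the σ-known formula so that only the Python standard library is needed.

```python
import random
from statistics import NormalDist, fmean

def coverage_rate(mu=50.0, sigma=10.0, n=30, level=0.95, trials=2000, seed=0):
    """Fraction of simulated Z-intervals (sigma known) that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    margin = z * sigma / n ** 0.5  # same margin for every sample, since sigma is known
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The parameter mu is fixed; the interval moves from sample to sample.
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes how often the procedure succeeds across repetitions.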
Confidence Interval: Range of values likely to contain population parameter
Frequentist Interpretation: Long-run proportion of intervals containing true parameter
Statistical Inference: Drawing conclusions about population from sample
• CI refers to procedure reliability, not specific interval
• True parameter is fixed, interval is random
• Confidence level is the long-run coverage rate of the procedure
• Think of CI as capturing the parameter in 95% of samples
• The true parameter is fixed, not probabilistic
• Confidence is about the method, not the result
• Thinking the parameter has 95% probability of being in interval
• Confusing confidence level with probability of specific interval
• Forgetting that parameter is fixed in frequentist view
A sample of 25 students has a mean score of 78 with a standard deviation of 12. Calculate the 95% confidence interval for the population mean. Use t-distribution. Show your work.
Step 1: Identify the parameters
n = 25, x̄ = 78, s = 12, confidence level = 95%
Step 2: Calculate degrees of freedom
df = n - 1 = 25 - 1 = 24
Step 3: Find the t-value for 95% confidence with df = 24
t₀.₀₂₅,₂₄ ≈ 2.064
Step 4: Calculate the standard error
SE = s/√n = 12/√25 = 12/5 = 2.4
Step 5: Calculate the margin of error
ME = t × SE = 2.064 × 2.4 = 4.954
Step 6: Calculate the confidence interval
Lower bound = x̄ - ME = 78 - 4.954 = 73.046
Upper bound = x̄ + ME = 78 + 4.954 = 82.954
The 95% confidence interval is (73.05, 82.95).
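The six steps above can be sketched in a few lines of Python. The helper name `t_interval` is hypothetical, and the critical value 2.064 is passed in from the t-table as in Step 3, since the standard library has no t-distribution quantile function.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for a mean with sigma unknown; t_crit comes from a t-table (df = n - 1)."""
    se = s / sqrt(n)   # standard error of the mean
    me = t_crit * se   # margin of error
    return xbar - me, xbar + me

lower, upper = t_interval(xbar=78, s=12, n=25, t_crit=2.064)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (73.05, 82.95)
```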
When the population standard deviation is unknown (which is usually the case), we use the t-distribution instead of the normal distribution. The t-distribution has heavier tails than the normal distribution, especially for small samples, which accounts for the additional uncertainty in estimating the population standard deviation from the sample.
Degrees of Freedom: Number of independent pieces of information
Standard Error: Standard deviation of sampling distribution
t-Distribution: Distribution used for critical values when σ is estimated from the sample; it differs most from Z for small samples
• Use t-distribution when σ is unknown
• df = n - 1 for mean CI
• SE = s/√n
• Always use t-distribution when σ is unknown
• For n > 30, t ≈ Z
• Remember to calculate SE before ME
• Using Z instead of t when σ is unknown
• Forgetting to calculate standard error
• Wrong degrees of freedom calculation
In a survey of 400 voters, 240 said they support a particular candidate. Calculate the 95% confidence interval for the true proportion of voters who support this candidate. Show your work.
Step 1: Calculate the sample proportion
p̂ = 240/400 = 0.60
Step 2: Identify parameters
n = 400, p̂ = 0.60, confidence level = 95%
Step 3: Find the Z-score for 95% confidence
Z = 1.96
Step 4: Calculate the standard error
SE = √[p̂(1-p̂)/n] = √[0.60 × 0.40 / 400] = √[0.24 / 400] = √0.0006 = 0.0245
Step 5: Calculate the margin of error
ME = Z × SE = 1.96 × 0.0245 = 0.0480
Step 6: Calculate the confidence interval
Lower bound = p̂ - ME = 0.60 - 0.0480 = 0.552
Upper bound = p̂ + ME = 0.60 + 0.0480 = 0.648
The 95% confidence interval is (0.552, 0.648) or (55.2%, 64.8%).
For proportions, we use the normal distribution (Z) when the sample size is large enough. The conditions for using the normal approximation are np̂ ≥ 5 and n(1-p̂) ≥ 5. In this case, 400 × 0.6 = 240 ≥ 5 and 400 × 0.4 = 160 ≥ 5, so the normal approximation is appropriate.
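The calculation and the condition check can be combined in a short sketch. The function name `proportion_interval` and its `min_count` parameter are assumptions made for illustration. The threshold of 5 follows the text above, though some textbooks use 10.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, level=0.95, min_count=5):
    """Normal-approximation (Wald) interval for a proportion."""
    failures = n - successes
    # n * p_hat equals the success count and n * (1 - p_hat) equals the failure count.
    if successes < min_count or failures < min_count:
        raise ValueError("success-failure condition not met; normal approximation is doubtful")
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    me = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me, p_hat + me

lower, upper = proportion_interval(240, 400)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")  # 95% CI: (0.552, 0.648)
```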
Sample Proportion: p̂ = number of successes / sample size
Normal Approximation: Using normal distribution for binomial
Success-Failure Condition: np̂ ≥ 5 and n(1-p̂) ≥ 5
• Check success-failure condition first
• SE = √[p̂(1-p̂)/n]
• Use Z-distribution for proportions
• Always check if normal approximation is valid
• Express final answer as percentage if needed
• Remember p̂(1-p̂) is maximized at p̂ = 0.5
• Not checking conditions for normal approximation
• Using wrong formula for standard error
• Confusing proportion with count
A researcher initially calculates a 95% confidence interval for a mean with n=100 and gets (48.5, 51.5). If the researcher wants to halve the margin of error while keeping the same confidence level, what sample size is needed? Explain the relationship between sample size and margin of error.
Step 1: Calculate the current margin of error
Current interval: (48.5, 51.5)
Current ME = (51.5 - 48.5) / 2 = 1.5
Desired ME = 1.5 / 2 = 0.75
Step 2: Understand the relationship
ME = t × (s/√n), so ME ∝ 1/√n (treating t and s as approximately constant)
Step 3: Calculate the required sample size
If ME needs to be halved, then √n needs to be doubled
So n needs to be multiplied by 4
New n = 100 × 4 = 400
Step 4: Verification
With n = 400, the new standard error becomes s/√400 = s/20
Compared to original: s/√100 = s/10
New SE is half of original SE, so new ME is approximately half of original ME (the t-value changes only slightly between n = 100 and n = 400)
The researcher needs a sample size of 400 to halve the margin of error.
The margin of error is inversely proportional to the square root of the sample size. This means that to halve the margin of error, we need to quadruple the sample size. This relationship has important implications for research design: improving precision significantly increases resource requirements. This is why researchers often balance desired precision with practical constraints.
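The square-root relationship reduces to a one-line rule, sketched below. The helper names are hypothetical, and the rule assumes the critical value and the sample standard deviation stay roughly the same as n grows.

```python
from math import ceil

def margin_from_bounds(lower, upper):
    """Margin of error is half the width of the interval."""
    return (upper - lower) / 2

def scaled_sample_size(n_current, reduction_factor):
    """Sample size needed to shrink the margin of error by reduction_factor (ME is proportional to 1/sqrt(n))."""
    return ceil(n_current * reduction_factor ** 2)

current_me = margin_from_bounds(48.5, 51.5)
print(current_me)                   # 1.5
print(scaled_sample_size(100, 2))   # 400
```

Cutting the margin of error to a third under the same assumptions would call for `scaled_sample_size(100, 3)`, that is, 900 observations.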
Margin of Error: Half the width of the confidence interval
Sample Size Effect: Relationship between n and precision
Resource Trade-off: Balancing precision and cost
• ME ∝ 1/√n
• To halve ME, quadruple n
• Precision improvements require quadratic resource increases (n grows with the square of the improvement factor)
• ME = (upper bound - lower bound) / 2
• To reduce ME by factor k, increase n by factor k²
• Consider practical constraints when choosing sample size
• Thinking ME decreases linearly with n
• Forgetting the square root relationship
• Not considering practical feasibility of large samples
Which statement about confidence levels is TRUE?
The answer is B) Higher confidence level results in wider interval. This is because higher confidence levels require larger critical values (Z or t), which increase the margin of error. For example, a 99% confidence interval uses Z = 2.576 compared to Z = 1.96 for 95% confidence, resulting in a wider interval.
There's a fundamental trade-off between confidence and precision. To be more confident that our interval contains the true parameter, we must accept a wider interval. Conversely, to get a more precise (narrower) interval, we must accept lower confidence. This is a fundamental limitation in statistical inference that researchers must consider when designing studies.
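To see the trade-off numerically, the sketch below computes two-sided Z critical values for the common confidence levels using the standard library. The function name `z_critical` is an assumption. Because the margin of error is the critical value times the standard error, the interval width for a fixed sample scales directly with these values.

```python
from statistics import NormalDist

def z_critical(level):
    """Two-sided standard normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

z95 = z_critical(0.95)
for level in (0.90, 0.95, 0.99):
    z = z_critical(level)
    # Width relative to the 95% interval, for the same data.
    print(f"{level:.0%} confidence: Z = {z:.3f}, width relative to 95% = {z / z95:.2f}")
```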
Confidence-Precision Trade-off: Relationship between confidence and interval width
Critical Value: Z or t value for given confidence level
Statistical Limitation: Fundamental constraints in inference
• Confidence level ↑ → Interval width ↑
• Precision ↑ → Confidence ↓ (for fixed sample)
• Trade-off is fundamental to statistical inference
• Higher confidence = wider interval
• Need both high confidence and high precision? Increase sample size!
• Common confidence levels: 90%, 95%, 99%
• Thinking higher confidence means narrower interval
• Forgetting the trade-off between confidence and precision
• Assuming 100% confidence is possible with finite samples
Q: When should I use t-distribution vs normal distribution for confidence intervals?
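A: Use Z when the population standard deviation σ is known, and for proportions when the success-failure condition holds. Use t when σ is unknown and estimated by the sample standard deviation s. For large samples (n > 30), t and Z values are very similar, so either gives nearly the same interval. The t-distribution has heavier tails to account for the uncertainty in estimating σ from the sample.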
Q: How are confidence intervals used in real-world research?
A: Confidence intervals are used extensively in applied research. They provide crucial information about the precision of estimates and the uncertainty in statistical results.