Statistics & probability • 2026 edition
Mean: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \)
Median: Middle value when data is sorted
Mode: Most frequently occurring value(s)
Range: \( \text{Max} - \text{Min} \)
Standard Deviation (population): \( \sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}} \)
Variance (population): \( \sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N} \)
Statistical measures describe the central tendency and spread of a dataset. The mean is the average value, the median is the middle value when sorted, the mode is the most frequently occurring value, and the range shows the spread between the highest and lowest values.
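As a quick check of the formulas above, the following minimal sketch computes each measure with Python's standard `statistics` module (Python 3.8 or later is assumed for `multimode`). The dataset is the one used in the mean question in this guide; the variable names are illustrative only.

```python
import statistics

# Sample data: the five values from the mean question in this guide.
data = [12, 15, 18, 20, 25]

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle value of the sorted data
modes = statistics.multimode(data)      # every value tied for most frequent
value_range = max(data) - min(data)     # Max - Min
variance = statistics.pvariance(data)   # population variance (divides by N)
std_dev = statistics.pstdev(data)       # population standard deviation

print(mean, median, modes, value_range, variance, round(std_dev, 3))
```

Because every value in this dataset occurs exactly once, `multimode` returns all five values, which corresponds to the "no mode" case. `pvariance` and `pstdev` match the population formulas listed above; `statistics.variance` and `statistics.stdev` would divide by n - 1 instead.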
\( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \)
Where \( \bar{x} \) is the mean, \( x_i \) represents each data value, and \( n \) is the number of values.
Sort the data and find the middle value. If n is odd, the median is the middle value. If n is even, the median is the average of the two middle values.
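The odd/even rule above can be sketched as a short function. This is one possible implementation, not the only one; the function name `median` and the sample inputs are chosen here for illustration.

```python
def median(values):
    """Return the median of a non-empty sequence of numbers."""
    ordered = sorted(values)  # the data must be sorted first
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median([5, 12, 8, 15, 3, 10, 7])   # seven values
even_result = median([4, 1, 3, 2])              # four values
print(odd_result, even_result)
```

Sorting a copy with `sorted` leaves the caller's list unchanged, which avoids surprising side effects.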
What is the mean of the numbers: 12, 15, 18, 20, 25?
The answer is A) 18. To find the mean, add all values and divide by the count: (12 + 15 + 18 + 20 + 25) ÷ 5 = 90 ÷ 5 = 18. The mean represents the average value of the dataset.
The mean (or average) is calculated by summing all values in a dataset and dividing by the number of values. This gives us a measure of central tendency that represents a typical value in the dataset. The mean is sensitive to outliers, meaning extreme values can significantly affect the result.
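To illustrate the sensitivity to outliers, the sketch below computes the mean of the five values from the question and then again after appending one extreme value. The extra value of 100 is hypothetical and is not part of the original dataset.

```python
def mean(values):
    """Sum all values, then divide by how many there are."""
    return sum(values) / len(values)


scores = [12, 15, 18, 20, 25]
scores_with_outlier = scores + [100]  # hypothetical extreme value

base_mean = mean(scores)
shifted_mean = mean(scores_with_outlier)
print(base_mean, round(shifted_mean, 2))
```

A single added value moves the mean from 18 to roughly 31.67, well above every one of the original five numbers.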
Mean: Sum of all values divided by the number of values
Central Tendency: Measure that represents the center of a dataset
Outliers: Extreme values that differ significantly from other observations
• Mean = Sum of values ÷ Number of values
• Mean is affected by every value in the dataset
• Mean can be a value not present in the dataset
• Add all numbers first, then divide
• Count the number of values carefully
• Check if result seems reasonable
• Forgetting to divide by the count
• Missing a value in the sum
• Dividing by the wrong number of values
Find the median of the numbers: 5, 12, 8, 15, 3, 10, 7. Show your work.
Step 1: Sort the data in ascending order
3, 5, 7, 8, 10, 12, 15
Step 2: Count the number of values
n = 7 (odd number)
Step 3: Find the middle position
For odd n, median is at position (n+1)/2 = (7+1)/2 = 4
Step 4: Identify the median
The 4th value in the sorted list is 8
The median is 8.
The median is the middle value when data is arranged in order. For an odd number of values, the median is the value at position (n+1)/2. For an even number of values, the median is the average of the two middle values. The median is robust to outliers and represents the 50th percentile.
Median: Middle value when data is sorted
Robust Statistic: A statistic that changes little when outliers are present
50th Percentile: Value below which 50% of data falls
• Must sort data first
• Odd n: median at position (n+1)/2
• Even n: median = average of middle two values
• Always sort data before finding median
• Count positions carefully
• For even count, average the two middle values
• Forgetting to sort the data first
• Using wrong formula for odd/even count
• Miscounting positions
A teacher records the number of books read by students in a month: 3, 5, 7, 5, 8, 3, 5, 6, 4, 5. What is the mode, and what does it tell the teacher about reading habits?
Step 1: Organize the data to count frequencies
3 appears 2 times
4 appears 1 time
5 appears 4 times
6 appears 1 time
7 appears 1 time
8 appears 1 time
Step 2: Identify the mode
The value 5 appears most frequently (4 times)
The mode is 5 books.
This tells the teacher that 5 books is the most common number of books read by students in a month.
The mode is the value that appears most frequently in a dataset. Of the mean, median, and mode, it is the only one that can be used with categorical data. The mode helps identify the most common or popular value. A dataset can have no mode (all values appear equally often), one mode (unimodal), two modes (bimodal), or more than two modes (multimodal).
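Counting frequencies, as in the worked solution, can be sketched with `collections.Counter`. The helper name `modes` is illustrative, and the bimodal and categorical inputs are made-up examples rather than data from this guide.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


books = [3, 5, 7, 5, 8, 3, 5, 6, 4, 5]      # data from the teacher problem
book_modes = modes(books)
bimodal_modes = modes([1, 1, 2, 2, 3])       # hypothetical bimodal data
color_modes = modes(["red", "blue", "red"])  # hypothetical categorical data
print(book_modes, bimodal_modes, color_modes)
```

When all values occur equally often, this helper returns every distinct value; treating that result as "no mode" is a convention the caller would need to apply.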
Mode: Most frequently occurring value in dataset
Frequency: How often a value occurs
Unimodal: Dataset with one mode
Bimodal: Dataset with two modes
• Mode = most frequent value
• Can have multiple modes or no mode
• Useful for categorical data
• Count frequency of each value
• The most frequent value is the mode
• Can have more than one mode
• Confusing mode with mean or median
• Not counting all occurrences
• Forgetting that multiple modes are possible
The daily temperatures for a week were: 68°F, 72°F, 70°F, 75°F, 69°F, 71°F, 95°F. Calculate the range and discuss how the outlier affects it. What would the range be without the outlier?
Step 1: Identify min and max values
With outlier: Min = 68°F, Max = 95°F
Range = Max - Min = 95 - 68 = 27°F
Step 2: Identify the outlier
95°F is significantly higher than the other temperatures, which all fall between 68°F and 75°F
Step 3: Calculate range without outlier
Without outlier: Min = 68°F, Max = 75°F
New Range = 75 - 68 = 7°F
The outlier increases the range from 7°F to 27°F, making it 20°F larger.
The range is calculated as the difference between the maximum and minimum values. It's the simplest measure of spread but is highly sensitive to outliers, because it depends only on the two most extreme values. A single outlier can dramatically increase the range and give a misleading impression of the data's variability. The interquartile range is more robust to outliers. The standard deviation uses every value and is less dominated by a single extreme than the range, but it is still affected by outliers.
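The comparison between the range and the interquartile range can be sketched on the temperature data. This example assumes Python 3.8 or later for `statistics.quantiles` and uses its default quartile method; other quartile conventions give slightly different values. The helper names are illustrative.

```python
import statistics


def value_range(values):
    """Range = maximum - minimum."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range = third quartile - first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


temps = [68, 72, 70, 75, 69, 71, 95]             # data from the problem
temps_no_outlier = [t for t in temps if t != 95]  # drop the 95°F reading

range_change = value_range(temps) - value_range(temps_no_outlier)
iqr_change = iqr(temps) - iqr(temps_no_outlier)
print(value_range(temps), value_range(temps_no_outlier), range_change, iqr_change)
```

The range grows by 20°F when the outlier is included, while the interquartile range changes far less because it ignores the lowest and highest quarters of the data.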
Range: Difference between max and min values
Outlier: Extreme value that differs significantly from others
Measure of Spread: Quantifies variability in data
• Range = Maximum - Minimum
• Range is sensitive to outliers
• Range only uses two values
• Always identify min and max values
• Consider if outliers affect the range
• Use range with caution when outliers are present
• Forgetting to identify the actual min/max
• Not considering impact of outliers
• Using incorrect order in subtraction
Which statistical measure is LEAST affected by outliers?
The answer is B) Median. The median is robust to outliers because it only depends on the middle value(s) when the data is sorted. The mean, range, and standard deviation are all affected by outliers: the mean shifts toward the outlier, the range increases dramatically, and the standard deviation increases due to the increased spread.
Different statistical measures have varying sensitivity to outliers. The median is robust because it's based only on the middle value(s) in an ordered dataset. The mean is sensitive because every value, including an extreme one, contributes to the sum. The range is highly sensitive because it only considers the two extreme values. Standard deviation is sensitive because it is built from squared deviations from the mean, so values far from the mean carry extra weight.
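One way to see these differences is to compute each measure with and without an outlier and compare the shifts. The sketch below reuses the week of temperatures from the range problem; the `summarize` helper is a hypothetical name introduced for this example.

```python
import statistics

temps = [68, 72, 70, 75, 69, 71, 95]  # last value is the outlier
typical = temps[:-1]                  # same week without the 95°F reading


def summarize(values):
    """Collect the measures being compared into one dictionary."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
    }


with_outlier = summarize(temps)
without_outlier = summarize(typical)

# How far each measure moves when the outlier is included.
shift = {name: with_outlier[name] - without_outlier[name] for name in with_outlier}
for name, change in shift.items():
    print(f"{name}: {change:+.2f}")
```

On this data the median moves by only 0.5°F, while the mean moves by more than 3°F and the standard deviation also increases noticeably.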
Robust Statistic: Resistant to outliers
Sensitive Statistic: Affected by outliers
Outlier Impact: How statistics change with extreme values
• Median: Robust to outliers
• Mean: Sensitive to outliers
• Range: Highly sensitive to outliers
• Use median when outliers are present
• Consider multiple measures together
• Identify outliers before choosing statistics
• Assuming all statistics are equally affected by outliers
• Not considering the impact of outliers on results
• Using inappropriate statistic for dataset with outliers
Q: When should I use mean vs median vs mode?
A: Use these measures based on your data and purpose:
For symmetric, unimodal data without outliers, the mean, median, and mode are approximately equal. For skewed data they diverge, with the mean pulled furthest toward the long tail.
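The relationship between the three measures can be checked on two small datasets. Both lists below are hypothetical values constructed for illustration: one symmetric with a single peak, one skewed to the right.

```python
import statistics

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # hypothetical, single peak at 3
skewed = [1, 1, 1, 2, 2, 3, 4, 8, 14]     # hypothetical, long right tail


def center(values):
    """Return (mean, median, mode) for a dataset with a single mode."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


symmetric_center = center(symmetric)
skewed_center = center(skewed)
print(symmetric_center, skewed_center)
```

In the symmetric list all three measures coincide at 3. In the right-skewed list the order is mode, then median, then mean, because the large values in the tail pull the mean upward.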
Q: How are statistical measures used in real-world applications?
A: Statistical measures are essential in many fields:
These measures help summarize data, identify patterns, and support decision-making processes.