Statistics
Hypothesis Testing
Year 12 · Year 13
- ✓ By the end of this lesson students will be able to formulate null and alternative hypotheses for a given problem.
- ✓ By the end of this lesson students will be able to perform hypothesis tests for binomial distributions, including calculating critical regions and p-values.
- ✓ By the end of this lesson students will be able to perform hypothesis tests for a population mean using the Normal distribution.
- ✓ By the end of this lesson students will be able to perform hypothesis tests for Pearson's product-moment correlation coefficient (PMCC).
- ✓ By the end of this lesson students will be able to interpret the results of hypothesis tests in context and understand Type I and Type II errors.
Key concepts
Hypothesis testing is a statistical method used to make decisions about a population parameter based on sample data. It involves setting up two competing hypotheses: the null hypothesis (H₀) and the alternative hypothesis (H₁). The null hypothesis represents a statement of no effect or no change, while the alternative hypothesis represents what we are trying to find evidence for. We calculate a test statistic from our sample data and compare it to a critical value or use it to find a p-value. The significance level (α) is the probability of incorrectly rejecting the null hypothesis when it is true (Type I error). A one-tailed test is used when the alternative hypothesis specifies a direction (e.g., greater than or less than), while a two-tailed test is used when the alternative hypothesis specifies a difference in either direction (e.g., not equal to). The critical region is the range of values for the test statistic that would lead to the rejection of H₀. The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming H₀ is true. If p-value < α, we reject H₀.
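A Type I error occurs when the null hypothesis is rejected although it is actually true. For a continuous test statistic, the probability of a Type I error equals the significance level, α. For a discrete test statistic, such as a binomial count, it is the actual probability of the critical region, which is at most α. A Type II error occurs when the null hypothesis is not rejected although it is actually false. The probability of a Type II error is denoted by β and depends on the true value of the parameter. We aim to minimise both types of error, but for a fixed sample size reducing one usually increases the other.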
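The trade-off between the two error types can be illustrated numerically. The following is a minimal sketch, assuming an upper-tailed binomial test with n = 20 and H₀: p = 0.5, and assuming the true proportion is 0.7 when calculating β. These values, and the helper name `upper_tail`, are illustrative only and are not taken from the lesson.

```python
from math import comb


def upper_tail(c: int, n: int, p: float) -> float:
    """Return P(X >= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


n = 20        # assumed number of trials
p_null = 0.5  # assumed value of p under H0
p_true = 0.7  # assumed true value of p, used only to compute beta

results = []
for c in (14, 15, 16):
    alpha = upper_tail(c, n, p_null)     # P(reject H0 | H0 true): Type I error
    beta = 1 - upper_tail(c, n, p_true)  # P(do not reject H0 | p = 0.7): Type II error
    results.append((c, alpha, beta))
    print(f"reject if X >= {c}: P(Type I) = {alpha:.4f}, P(Type II) = {beta:.4f}")
```

Moving the boundary of the critical region from 14 to 16 makes a Type I error less likely and a Type II error more likely, which is the trade-off described above.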
The binomial hypothesis test is used when we have a fixed number of independent trials (n), each with two possible outcomes (success/failure) and a constant probability of success (p). We test a hypothesis about the population proportion p, so under H₀ the number of successes satisfies X ~ B(n, p), with p taking the value stated in H₀. The test statistic is the number of successes (X) observed in the sample. Using the binomial distribution, we calculate the probability of observing a result as extreme as, or more extreme than, the sample result in the direction of H₁, assuming H₀ is true. This probability is the p-value. For a one-tailed test, reject H₀ if the p-value is less than the significance level; for a two-tailed test, compare the one-tail probability with half the significance level.
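As a rough sketch of the calculation described above, a one-tailed binomial p-value can be computed by summing binomial probabilities. The function name `binomial_p_value` and the coin-toss figures (9 heads in 10 tosses, H₀: p = 0.5, H₁: p > 0.5) are assumptions made for illustration.

```python
from math import comb


def binomial_p_value(x: int, n: int, p0: float, tail: str = "upper") -> float:
    """One-tailed p-value for an observed count x, assuming X ~ B(n, p0) under H0."""

    def pmf(k: int) -> float:
        return comb(n, k) * p0**k * (1 - p0) ** (n - k)

    if tail == "upper":   # H1: p > p0, so sum P(X = k) for k >= x
        return sum(pmf(k) for k in range(x, n + 1))
    if tail == "lower":   # H1: p < p0, so sum P(X = k) for k <= x
        return sum(pmf(k) for k in range(0, x + 1))
    raise ValueError("tail must be 'upper' or 'lower'")


# Illustrative data (not from the lesson): 9 heads in 10 tosses.
p_value = binomial_p_value(9, 10, 0.5, tail="upper")
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "do not reject H0")
```

Here P(X ≥ 9) = 11/1024, which is below 0.05, so this illustrative test would reject H₀ at the 5% level.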
The Normal-distribution test for a population mean is used to determine whether a sample mean (x̄) is significantly different from a hypothesised population mean (μ₀). It applies when the population is Normally distributed with known standard deviation (σ), or when the sample is large enough (commonly n > 30) for the Central Limit Theorem to make the distribution of the sample mean approximately Normal. Under H₀, the sample mean is distributed as N(μ₀, σ²/n). The test statistic z = (x̄ - μ₀) / (σ / √n) measures how many standard errors the sample mean is away from the hypothesised population mean. We compare this z-value to critical values from the standard Normal distribution or use it to find a p-value.
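The calculation can be sketched with Python's standard library, assuming σ is known. The function name `z_test` and the sample figures (x̄ = 52, μ₀ = 50, σ = 6, n = 36) are hypothetical and chosen only to give a round test statistic.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar: float, mu0: float, sigma: float, n: int, tail: str = "two"):
    """Return (z, p_value) for a test of a population mean with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))  # number of standard errors from mu0
    phi = NormalDist().cdf                 # standard Normal cumulative distribution
    if tail == "two":      # H1: mu != mu0
        p_value = 2 * (1 - phi(abs(z)))
    elif tail == "upper":  # H1: mu > mu0
        p_value = 1 - phi(z)
    elif tail == "lower":  # H1: mu < mu0
        p_value = phi(z)
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, p_value


# Illustrative data (not from the lesson).
z, p_value = z_test(52, 50, 6, 36, tail="two")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

With these figures the standard error is 6 / √36 = 1, so z = 2 and the two-tailed p-value is a little under 0.05.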
The hypothesis test for Pearson's product-moment correlation coefficient (PMCC) is used to determine whether there is a statistically significant linear relationship between two variables in a population, based on a sample's PMCC (r). The null hypothesis (H₀) typically states that there is no linear correlation in the population (ρ = 0), where ρ is the population PMCC. The alternative hypothesis (H₁) can be that there is a positive correlation (ρ > 0), a negative correlation (ρ < 0), or simply a non-zero correlation (ρ ≠ 0). We compare the calculated sample PMCC (r) to critical values found in statistical tables, which depend on the sample size (n), the significance level (α) and whether the test is one-tailed or two-tailed. For H₁: ρ > 0, reject H₀ if r is greater than the critical value; for H₁: ρ < 0, reject H₀ if r is less than the negative of the critical value; for H₁: ρ ≠ 0, reject H₀ if the absolute value of r exceeds the two-tailed critical value.
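A minimal sketch of this comparison is shown below. The data set is made up for illustration, the function names `pmcc` and `reject_h0` are hypothetical, and the critical value 0.8054 is assumed to have been read from PMCC tables for n = 5 at the 5% level (one-tailed).

```python
from math import sqrt


def pmcc(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


def reject_h0(r: float, critical_value: float, alternative: str) -> bool:
    """Decide whether r lies in the critical region for the stated H1."""
    if alternative == "positive":   # H1: rho > 0
        return r > critical_value
    if alternative == "negative":   # H1: rho < 0
        return r < -critical_value
    if alternative == "two-sided":  # H1: rho != 0 (use the two-tailed table value)
        return abs(r) > critical_value
    raise ValueError("unknown alternative")


# Made-up sample of n = 5 pairs.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
r = pmcc(xs, ys)
critical_value = 0.8054  # assumed table value: n = 5, 5% level, one-tailed
print(f"r = {r:.4f}, reject H0: {reject_h0(r, critical_value, 'positive')}")
```

For this sample r = 0.8, which falls just short of the assumed critical value, so H₀ would not be rejected despite the fairly strong sample correlation. Small samples need a large r to reach significance.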
Key facts to remember
- 1. The null hypothesis (H₀) is a statement of no effect or no difference, while the alternative hypothesis (H₁) is what we are trying to find evidence for.
- 2. The significance level (α) is the maximum probability of making a Type I error (rejecting H₀ when it is true).
- 3. A one-tailed test is used when H₁ specifies a direction (e.g., p > 0.5), and a two-tailed test when H₁ specifies a difference (e.g., p ≠ 0.5).
- 4. The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming H₀ is true.
- 5. If p-value < α, reject H₀. If p-value ≥ α, do not reject H₀.
- 6. For binomial tests, use the Binomial distribution B(n, p) and calculate cumulative probabilities.
- 7. For Normal mean tests, use the z-test statistic z = (x̄ - μ₀) / (σ / √n) when σ is known or n is large.
- 8. For correlation tests, compare the sample PMCC (r) to critical values from tables, based on sample size (n) and significance level (α).
Worked examples
Example 1
A manufacturer claims that 20% of their new light bulbs are faulty. A quality control manager tests a random sample of 50 light bulbs and finds 15 faulty ones. Test, at the 5% significance level, whether there is evidence that the proportion of faulty light bulbs is higher than the manufacturer's claim.
Answer
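Let X be the number of faulty bulbs in the sample and p the proportion of faulty bulbs. H₀: p = 0.2, H₁: p > 0.2 (one-tailed). Under H₀, X ~ B(50, 0.2). The p-value is P(X ≥ 15) = 1 - P(X ≤ 14) = 1 - 0.9393 = 0.0607. Since 0.0607 > 0.05, do not reject H₀. There is insufficient evidence, at the 5% significance level, that the proportion of faulty light bulbs is higher than the manufacturer's claim.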
For binomial tests, always use cumulative probabilities (P(X ≤ x) or P(X ≥ x)) to find the p-value or critical region.
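The critical-region approach to Example 1 can be sketched as follows, assuming X ~ B(50, 0.2) under H₀ and an upper-tailed test at the 5% level. The helper name `upper_tail` is hypothetical.

```python
from math import comb


def upper_tail(c: int, n: int, p: float) -> float:
    """Return P(X >= c) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


n, p0, alpha = 50, 0.2, 0.05  # values from Example 1
observed = 15                 # faulty bulbs found in the sample

# Smallest c with P(X >= c) <= alpha: the critical region is X >= c.
critical_value = next(c for c in range(n + 1) if upper_tail(c, n, p0) <= alpha)
actual_level = upper_tail(critical_value, n, p0)  # actual significance level
p_value = upper_tail(observed, n, p0)

print(f"critical region: X >= {critical_value}")
print(f"actual significance level = {actual_level:.4f}")
print(f"p-value for X = {observed}: {p_value:.4f}")
```

Under these assumptions the critical region is X ≥ 16, so an observed value of 15 lies just outside it, and the p-value of about 0.061 is correspondingly a little above 0.05.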
Example 2
A company claims that the mean weight of their cereal boxes is 500g. A random sample of 40 boxes has a mean weight of 497g. The population standard deviation is known to be 8g. Test, at the 1% significance level, whether there is evidence that the mean weight of the cereal boxes is not 500g.
Answer
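H₀: μ = 500, H₁: μ ≠ 500 (two-tailed). Under H₀, the sample mean is distributed as N(500, 8²/40). The test statistic is z = (497 - 500) / (8 / √40) ≈ -2.37. For a two-tailed test at the 1% significance level the critical values are z = ±2.576. Since -2.576 < -2.37 < 2.576, the test statistic is not in the critical region, so do not reject H₀. There is insufficient evidence, at the 1% level, that the mean weight is not 500g.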
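For a two-tailed test, remember to halve the significance level in each tail when finding critical values from tables. When using a p-value, either compare the one-tail probability with α/2 or double the one-tail probability and compare it with α.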
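The figures in Example 2 can be checked with a short sketch using Python's standard library. It assumes, as the question states, that σ = 8 is known, and it shows both the critical-value and the p-value approach.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.01                          # 1% significance level, two-tailed
x_bar, mu0, sigma, n = 497, 500, 8, 40  # values from Example 2

z = (x_bar - mu0) / (sigma / sqrt(n))          # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # alpha / 2 in each tail
in_critical_region = abs(z) > z_crit

p_value = 2 * NormalDist().cdf(-abs(z))        # double the one-tail probability

print(f"z = {z:.3f}, critical values = ±{z_crit:.3f}")
print(f"p-value = {p_value:.4f}")
print("reject H0" if in_critical_region else "do not reject H0")
```

Both approaches agree: |z| ≈ 2.37 is below the critical value of about 2.576, and the two-tailed p-value of roughly 0.018 is above 0.01.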
Example 3
A researcher wants to investigate if there is a positive linear correlation between hours of study (x) and exam scores (y) for a particular subject. A random sample of 12 students yields a Pearson's product-moment correlation coefficient (r) of 0.55. Test, at the 5% significance level, whether there is evidence of a positive linear correlation.
Answer
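H₀: ρ = 0, H₁: ρ > 0 (one-tailed), where ρ is the population PMCC. From tables, the critical value for n = 12 at the 5% significance level (one-tailed) is 0.4973. Since r = 0.55 > 0.4973, reject H₀. There is evidence, at the 5% level, of a positive linear correlation between hours of study and exam scores.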
Ensure you use the correct critical value from the tables, considering whether it's a one-tailed or two-tailed test and the correct significance level and sample size.
Common mistakes
- ✗ Confusing Type I and Type II errors, or not understanding their implications.
- ✗ Incorrectly formulating the null and alternative hypotheses, especially for one-tailed vs two-tailed tests.
- ✗ Using the wrong distribution or formula for the test statistic (e.g., using a Normal approximation for the Binomial when n is too small, or using an incorrect standard error).
- ✗ Failing to split the significance level between the two tails (α/2 in each) for two-tailed tests when finding critical values, or comparing a one-tail probability with α instead of α/2.
- ✗ Not stating the conclusion in the context of the original problem, or making definitive statements when only 'evidence' is found.
Exam tips
- ★ Always clearly state H₀, H₁, and the significance level at the start of your solution.
- ★ Draw a sketch of the distribution (Normal or Binomial) to visualise the critical region and test statistic, especially for two-tailed tests.
- ★ Show all steps of your calculations, including the formula used, substitution of values, and the final test statistic or p-value.
- ★ Ensure your conclusion is directly related to the problem's context and uses appropriate statistical language (e.g., 'sufficient evidence', 'insufficient evidence').
