Analysis of Variance
ANOVA: What does it do and how does it do it?
Analysis of Variance (aka "AOV" or "ANOVA") is an inferential statistical procedure.
- Inferential statistical procedures are ways to determine whether or not the results of a research study are likely to be repeatable with new samples.
ANOVA is used to determine whether or not the independent variable in an experiment was effective.
- This tutorial introduces the conceptual framework of Analysis of Variance: how it does what it does.
- This tutorial does not teach how to complete the calculations for ANOVA.
What you should know before continuing.
- The simplest experiment has two groups or conditions. Each group has several subjects.
- The only difference between the two groups is the "level" of the independent variable.
- For example, in an experiment to determine whether or not caffeine changes how long it takes to go to sleep, the independent variable might be the number of "NoDoz" tablets a person took just before bedtime. The "levels" might be 0 "NoDoz" tablets vs 1 "NoDoz" tablet.
- And, of course, the "0" group would take a placebo tablet.
- The same measurement procedure would be applied to both groups. The measurement procedure measures the value of the dependent variable. These measurements are the data.
- If the independent variable is effective then the average value of the dependent variable will be different in the two groups.
The data set
Two or more columns of numbers (observations). Each column contains the data that was collected under one level of the independent variable.
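As a rough sketch of what such a data set looks like, the snippet below stores one column per level of the independent variable and reports each column's mean. The numbers are invented for illustration (minutes to fall asleep in the caffeine example) and do not come from the tutorial.

```python
from statistics import mean

# Hypothetical data: minutes to fall asleep under two levels of the IV.
# These values are made up for illustration only.
data = {
    "0 tablets (placebo)": [12, 15, 11, 14, 13],
    "1 tablet": [18, 21, 17, 20, 19],
}

# One mean per column, i.e. per level of the independent variable.
group_means = {level: mean(values) for level, values in data.items()}

for level, m in group_means.items():
    print(f"{level}: n={len(data[level])}, mean={m}")
```

If the independent variable is effective, the column means should differ by more than sampling error alone would produce; deciding that is the job of ANOVA.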
There are several strands of knowledge that must be braided to understand the rationale of ANOVA.
1. How will the data differ when the independent variable is effective compared with when it is NOT effective?
How statistical theory models an experiment:
- "Was the Independent Variable Effective?"
- The principle:
- ANOVA assumes an effective IV adds a constant to each data value.
- ANOVA assumes an effective IV does NOT affect dispersion.
- The statistical hypotheses
- Ho: mu_1 = mu_2 = ... = mu_r (given r levels of the IV)
- Ha: At least two of the population means are not equal
- Additional resources and illustrations are in the Teaching Statistics stack.
- General Linear Model
- An observation in an experiment is broken down into a sum
- The value of the observation =
- The overall mean
- PLUS How far the average of a group is from the overall mean
- PLUS How far an observation is from the average of the group.
- A complete illustration is given in the Teaching Statistics stack.
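The two ideas in this list can be sketched in a few lines of Python. The scores and the size of the effect are hypothetical, chosen only so the arithmetic is easy to follow: the first part shows that adding a constant shifts the mean without changing the variance, and the second part breaks every observation into overall mean, group effect, and residual.

```python
from statistics import mean, variance

# Hypothetical scores for a group in which the IV has no effect (made up).
baseline = [11, 13, 12, 15, 14]
effect = 6  # assumed constant that an effective IV adds to each value

treated = [x + effect for x in baseline]

# An added constant moves the mean but leaves dispersion unchanged.
print("means:", mean(baseline), mean(treated))
print("variances:", variance(baseline), variance(treated))

# General linear model:
# observation = overall mean + (group mean - overall mean) + (observation - group mean)
groups = {"control": baseline, "treated": treated}
all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

rows = []
for name, values in groups.items():
    group_mean = mean(values)
    group_effect = group_mean - grand_mean
    for x in values:
        residual = x - group_mean
        rows.append((name, x, grand_mean, group_effect, residual))

for name, x, gm, ge, res in rows:
    print(f"{name}: {x} = {gm} + ({ge}) + ({res})")
```

Each printed line reproduces the observation on the left from the three parts on the right, which is the breakdown the list describes.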
2. How is this difference (how the data differ when the IV is or is not effective) used to create the test statistic, F ?
- Rationale of ANOVA
- Additional Resources
- Teaching Statistics Stack
- If Ho is true (the IV was NOT effective), F will be 1.00 (given NO sampling error).
- If Ho is false (the IV was effective), F will tend to be greater than 1.
- Calculation of F.
- SS within groups (aka SS error), SS between groups, SS total
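One way to see why F sits near 1 under Ho and above 1 otherwise is to look at the expected values of the two mean squares. The sketch below assumes a balanced one-way design with fixed effects; the symbols are introduced here and are not the tutorial's: r groups, n subjects per group, error variance \sigma^2, and group effect \alpha_j = \mu_j - \mu, where \mu is the mean of the r population means.

```latex
\begin{aligned}
E[MS_{\text{within}}]  &= \sigma^2 \\
E[MS_{\text{between}}] &= \sigma^2 + \frac{n \sum_{j=1}^{r} \alpha_j^2}{r - 1} \\
F &= \frac{MS_{\text{between}}}{MS_{\text{within}}}
\end{aligned}
```

When Ho is true every \alpha_j is 0, so both mean squares estimate the same \sigma^2 and their ratio is about 1. When Ho is false the extra term inflates the numerator, so F tends to exceed 1.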
3. What is the sampling distribution of F if Ho is true?
- The test statistic, F (aka the "Variance Ratio"), is the ratio of two sample variances.
- If the IV is NOT effective (the null hypothesis is true) then the samples that form the F-ratio are drawn from the same population.
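A small simulation can illustrate what "drawn from the same population" implies for a variance ratio. This is only a sketch under assumed settings that are not from the tutorial: a normal population with mean 50 and standard deviation 10, and 10 observations per sample. The cutoff 3.18 is the tabled .05 critical value of F for 9 and 9 degrees of freedom.

```python
import random
from statistics import median, variance

random.seed(1)  # fixed seed so the run is repeatable

def variance_ratio(n=10, mu=50, sd=10):
    """Ratio of two sample variances drawn from the SAME normal population."""
    a = [random.gauss(mu, sd) for _ in range(n)]
    b = [random.gauss(mu, sd) for _ in range(n)]
    return variance(a) / variance(b)

ratios = [variance_ratio() for _ in range(5000)]

# The ratios scatter around 1; only a small share are large by chance.
middle = median(ratios)
share_large = sum(r > 3.18 for r in ratios) / len(ratios)

print("median ratio:", round(middle, 2))
print("share above 3.18:", round(share_large, 3))
```

The median should land close to 1 and roughly 5% of the ratios should exceed the cutoff, which is the behaviour the sampling distribution of F describes when Ho is true.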
4. How is the test statistic, F, calculated from a data set?
- Calculation and Meaning of the Sums of Squares in the Independent Groups ANOVA Table (One Way). The simulated data used in the examples is from the "Mental Rotation" experiment described "here".
- Example 1: Ho true
- Detailed descriptions are given in the page where the null hypothesis is true. These descriptions are not repeated in the page illustrating the instance when the null hypothesis is false.
- Example 2: Ho false
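The sums of squares and F for an independent groups, one-way design can be computed along the following lines. This is a minimal sketch, not the tutorial's worked example: the function name `one_way_anova` is hypothetical and the scores are invented, not the "Mental Rotation" data.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SS between, SS within, SS total, F) for a list of groups."""
    group_means = [mean(g) for g in groups]
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # SS between: how far each group mean is from the overall mean,
    # weighted by the number of observations in the group.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # SS within (SS error): how far each observation is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    # SS total: how far each observation is from the overall mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, ss_total, f

# Hypothetical scores for three groups (made up for illustration).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
ss_b, ss_w, ss_t, f = one_way_anova(groups)
print(ss_b, ss_w, ss_t, f)
```

For these scores SS between is 24, SS within is 6, and SS total is 30, so the two parts add up to the total, and F is 12 on 2 and 6 degrees of freedom.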
5. How does the sampling distribution of F differ when Ho is true and when it is false?
- Using Monte Carlo techniques to show how the sampling distribution of F differs when the null hypothesis is false from when it is true.
- Excel spreadsheet (download) example for determining whether Digit and Letter Memory Spans differ
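The Monte Carlo idea can also be sketched in Python. Everything specific here is an assumption made for illustration rather than the tutorial's own setup: three groups of 10, normal populations with standard deviation 10, population means of 50, 50, 50 when Ho is true and 50, 55, 60 when it is false, and 3.35 as the tabled .05 critical value of F for 2 and 27 degrees of freedom.

```python
import random
from statistics import fmean

random.seed(2)  # fixed seed so the run is repeatable

def f_ratio(groups):
    """One-way ANOVA F for a list of independent groups."""
    means = [fmean(g) for g in groups]
    all_values = [x for g in groups for x in g]
    grand = fmean(all_values)
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def simulate(pop_means, n=10, sd=10, reps=2000):
    """Repeat the experiment many times and collect the F from each run."""
    fs = []
    for _ in range(reps):
        groups = [[random.gauss(mu, sd) for _ in range(n)] for mu in pop_means]
        fs.append(f_ratio(groups))
    return fs

f_null = simulate([50, 50, 50])  # Ho true: same population mean everywhere
f_alt = simulate([50, 55, 60])   # Ho false: the IV adds a constant per level

def share_above(fs, cutoff=3.35):
    return sum(f > cutoff for f in fs) / len(fs)

print("mean F, Ho true: ", round(fmean(f_null), 2))
print("mean F, Ho false:", round(fmean(f_alt), 2))
print("share above 3.35, Ho true: ", round(share_above(f_null), 3))
print("share above 3.35, Ho false:", round(share_above(f_alt), 3))
```

With Ho true the F values cluster near 1 and about 5% exceed the cutoff; with Ho false the whole distribution shifts toward larger values, so a much larger share exceeds it.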
- Examples of One-Way ANOVA
- "One-Way" signifies there is one independent variable (Digit vs Letter). If I had two independent variables (Digit vs Letter; Male vs Female) that would be a "Two-Way" ANOVA. A Three-Way ANOVA would have data values that depend on three independent variables; for example, Digit vs Letter; Male vs Female; 20-year-old vs 50-year-old. The scheme can be extended.
- Independent Groups ANOVA
- Dependent Groups ANOVA
- Problem Sheet
© 2002 - 2006 by Burrton Woodruff. All rights reserved. Modified Sunday, March 25, 2007