Analysis of Variance
ANOVA: What does it do and how does it do it?
Analysis of Variance (aka "AOV" or "ANOVA") is an inferential statistical procedure.
 Inferential statistical procedures are ways to determine whether or not the results of a research study are likely to be repeatable with new samples.
ANOVA is used to determine whether or not the independent variable in an experiment was effective.
 This tutorial introduces the conceptual framework of Analysis of Variance: how it does what it does.
 This tutorial does not teach how to complete the calculations for ANOVA.
What you should know before continuing.
 The simplest experiment has two groups or conditions. Each group has several subjects.
 The only difference between the two groups is the "level" of the independent variable.
 For example, in an experiment to determine whether or not caffeine changes how long it takes to go to sleep, the independent variable might be the number of "NoDoz" tablets a person took just before bedtime. The "levels" might be 0 "NoDoz" tablets vs 1 "NoDoz" tablet.
 And, of course, the "0" group would take a placebo tablet.
 The same measurement procedure would be applied to both groups. The measurement procedure measures the value of the dependent variable. This is the data.
 If the independent variable is effective then the average value of the dependent variable will be different in the two groups.
The data set
Two or more columns of numbers (observations). Each column contains the data that was collected under one level of the independent variable.
There are several strands of knowledge that must be braided to understand the rationale of ANOVA.
1. How will the data be different when the independent variable is effective than when it is NOT effective?

How statistical theory models an experiment:
 "Was the Independent Variable Effective?"
 The principle:
 ANOVA assumes an effective IV adds a constant to each data value.
 ANOVA assumes an effective IV does NOT affect dispersion.
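 The two assumptions above can be checked with a few lines of Python. This is only a sketch: the scores and the size of the effect (5 minutes) are made-up values chosen for illustration, not data from the tutorial.

```python
import statistics

# Hypothetical sleep-latency scores (minutes) for the "0 tablets" group.
control = [12.0, 15.0, 9.0, 14.0, 10.0]

# Assumed model: an effective IV adds the same constant to every score.
effect = 5.0
treated = [x + effect for x in control]

mean_shift = statistics.mean(treated) - statistics.mean(control)
var_control = statistics.variance(control)   # sample variance (n - 1)
var_treated = statistics.variance(treated)

print(f"shift in group mean: {mean_shift:.2f}")
print(f"variance, control:   {var_control:.2f}")
print(f"variance, treated:   {var_treated:.2f}")
```

 The group mean moves by exactly the added constant, while the variance (the dispersion) is the same in both groups.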
 The statistical hypotheses
 Ho: \mu_1 = \mu_2 = \dots = \mu_r (given r levels of the IV)
 Ha: At least one group mean differs from the others.
 Additional resources and illustrations are in the Teaching Statistics stack.
 General Linear Model
 An observation in an experiment is broken down into a sum
 The value of the observation =
 The overall mean
 PLUS How far the average of a group is from the overall mean
 PLUS How far an observation is from the average of the group.
 A complete illustration is given in the Teaching Statistics stack.
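 As a rough sketch of this breakdown, the Python below splits each observation into the overall mean, the distance of its group mean from the overall mean, and the distance of the observation from its group mean. The group labels and scores are hypothetical, and `decompose` is an illustrative helper, not part of any library.

```python
import statistics

# Hypothetical data: one list of scores per level of the IV.
groups = {
    "0 tablets": [12.0, 15.0, 9.0],
    "1 tablet": [18.0, 21.0, 15.0],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = statistics.mean(all_scores)

def decompose(groups, grand_mean):
    """Return (level, observation, group_effect, residual) for every score."""
    rows = []
    for level, scores in groups.items():
        group_mean = statistics.mean(scores)
        group_effect = group_mean - grand_mean  # group mean vs overall mean
        for x in scores:
            residual = x - group_mean           # observation vs group mean
            rows.append((level, x, group_effect, residual))
    return rows

rows = decompose(groups, grand_mean)
for level, x, group_effect, residual in rows:
    # The three parts add back up to the observation.
    print(f"{level}: {x:.1f} = {grand_mean:.1f} + ({group_effect:+.1f}) + ({residual:+.1f})")
```

 Every printed line is an identity: the three parts on the right always sum to the observation on the left.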
2. How is this difference (how the data differ when the IV is or is not effective) used to create the test statistic, F ?

 Rationale of ANOVA
 Additional Resources
 Teaching Statistics Stack
 If Ho is true (IV was NOT effective), F will be 1.00 (given NO sampling error).
 If Ho is false (IV was effective), F will be greater than 1.
 Calculation of F.
 SS within groups (aka SS error), SS between groups, SS total
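 A minimal sketch of how the three sums of squares and F fit together for an independent-groups one-way design. The function name `one_way_anova` and the scores are assumptions made for illustration; the arithmetic is the standard one-way partition (SS total = SS between + SS within).

```python
import statistics

def one_way_anova(groups):
    """Return the sums of squares and F for a list of groups (lists of scores)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)

    # SS between: how far each group mean is from the overall mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # SS within (SS error): how far each score is from its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    # SS total: how far each score is from the overall mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "ss_total": ss_total, "f": f_ratio}

# Hypothetical scores for two levels of the IV.
result = one_way_anova([[12.0, 15.0, 9.0], [18.0, 21.0, 15.0]])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

 F is the between-groups mean square divided by the within-groups mean square, so it grows as the group means move apart relative to the spread inside the groups.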
3. What is the sampling distribution of F if Ho is true?

 The test statistic, F (aka the "Variance Ratio"), is the ratio of two sample variances.
 If the IV is NOT effective (the null hypothesis is true), then the samples that form the F ratio are drawn from the same population.
4. How is the test statistic, F, calculated from a data set?

 Calculation and Meaning of the Sums of Squares in the Independent Groups ANOVA Table (One Way). The simulated data used in the examples is from the "Mental Rotation" experiment described "here".
 Example 1: Ho true
 Detailed descriptions are given on the page where the null hypothesis is true. These descriptions are not repeated on the page illustrating the case where the null hypothesis is false.
 Example 2: Ho false
5. How does the sampling distribution of F differ when Ho is true and when it is false?

 Monte Carlo techniques are used to show how the sampling distribution of F differs when the null hypothesis is false from when it is true.
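 One way such a Monte Carlo comparison might be set up is sketched below. Everything numeric here is an assumption made for illustration: normal populations with mean 50 and standard deviation 10, 10 subjects per group, 2000 simulated experiments, and a treatment effect of 10 when Ho is false. The helper names `f_ratio` and `simulate_f` are hypothetical.

```python
import random
import statistics

def f_ratio(groups):
    """F for an independent-groups one-way design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def simulate_f(effect, n_per_group=10, n_experiments=2000, seed=0):
    """Run many simulated two-group experiments and collect their F values."""
    rng = random.Random(seed)
    f_values = []
    for _ in range(n_experiments):
        control = [rng.gauss(50.0, 10.0) for _ in range(n_per_group)]
        # An effective IV adds a constant to the mean; the spread is unchanged.
        treated = [rng.gauss(50.0 + effect, 10.0) for _ in range(n_per_group)]
        f_values.append(f_ratio([control, treated]))
    return f_values

f_null = simulate_f(effect=0.0)   # Ho true: both groups from one population
f_alt = simulate_f(effect=10.0)   # Ho false: treated mean shifted by 10

print(f"median F when Ho is true:  {statistics.median(f_null):.2f}")
print(f"median F when Ho is false: {statistics.median(f_alt):.2f}")
```

 Plotting histograms of `f_null` and `f_alt` would show the second distribution shifted toward larger values of F, which is what makes F usable as a test statistic.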
 Excel spreadsheet (download) example for determining whether Digit and Letter Memory Spans differ
 Examples of One-Way ANOVA
 "One-Way" signifies that there is one independent variable (Digit vs Letter). Two independent variables (Digit vs Letter; Male vs Female) would call for a "Two-Way" ANOVA. A Three-Way ANOVA would have data values that depend on three independent variables; for example, Digit vs Letter; Male vs Female; 20-year-old vs 50-year-old. The scheme can be extended.
 Independent Groups ANOVA
 Dependent Groups ANOVA
 Problem Sheet
© 2002-2006 by Burrton Woodruff. All rights reserved. Modified Sunday, March 25, 2007