Comments Regarding Probability
Definition of Probability
We will limit our discussion of probability to the definition that a probability is the relative frequency of the occurrence of the event in the long run.
The probability of event X, P(X), is
- the number of times event X occurred (instead of other possibilities such as Y or Z); f(X)
- divided by
- the total number of times it could have occurred; f(T)
- provided that the number of times it could have occurred is very, very large; n → ∞
Flip a coin 1,000 times. It comes up heads 480 times, tails 519 times, and lands on its edge once. What is the probability of the event "tails" when the coin is flipped?
f(X) = 519
f(T) = 1000 (also known as "n")
P(X) = 519/1000 = 0.519
But since n isn't infinitely large the value of 0.519 isn't really a probability.
Because we generally don't have an infinite number of trials to count, we never really have a probability. Instead we simply have a proportion. If I toss a coin 50 times and it comes up heads 23 times, the probability is not 23/50 (0.46); that number is only the relative frequency, or proportion.
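The difference between a proportion and a long-run probability can be illustrated with a small simulation. The sketch below is only an illustration, not part of the original notes: it assumes a fair two-sided coin (no edge landings), and the function name `relative_frequency`, the parameter `p_tails`, and the fixed seed are hypothetical choices made for the example.

```python
import random

def relative_frequency(n_flips, p_tails=0.5, seed=0):
    """Return f(X) / f(T) for the event 'tails' over n_flips simulated flips.

    Assumption: a fair two-sided coin (p_tails = 0.5), unlike the real
    coin in the text that once landed on its edge.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    tails = sum(1 for _ in range(n_flips) if rng.random() < p_tails)
    return tails / n_flips

# The worked example from the text: 519 tails in 1,000 flips.
observed_proportion = 519 / 1000

if __name__ == "__main__":
    print("observed proportion:", observed_proportion)
    # The relative frequency wanders for small n and settles down as n grows.
    for n in (50, 1_000, 100_000):
        print(n, relative_frequency(n))
```

Each printed value is a proportion from a finite number of flips. Only as n grows without bound would the proportion be expected to approach the underlying probability.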
Probability values must be between 0.0 and 1.0. The limits are that X never occurs [P(X) = 0] or that it always occurs [P(X) = 1]. A lot of times people get sloppy and refer to probabilities as percentages: "The probability of precipitation is 30%." Just move the decimal to the left two places and drop the "%" sign (giving 0.30) if the sloppiness bothers you (as much as it does me).
Probability is Used Two Ways in Modern Science
- It is a substitute for certainty and/or truth.
- If an expert says that a farmer has a 0.90 probability of getting a successful crop, where do we stand? If the farmer's crop fails, does that mean the expert is wrong? No! What the expert was saying is, "Among a large number of similar fields and under similar conditions as to weather, etc., about nine of ten can be expected to succeed."
- Ditto for weather forecasts. A 30% chance of rain means that in the past, when we have had these weather conditions, it has rained 30% of the time (30 times in every 100, for example).
- Probability also refers to the "success rate" of the methods we use. For example:
- Suppose a pollster tells us that if the election were held today, 53% of the voters would vote for candidate "A" and that the margin of error is 8%.
- I probably would like to generalize from the clinical prediction (the sample of people actually measured) to the scientific prediction (the proportion of all voters who would vote for candidate "A").
- In the figure below we see the sample actually observed and other possible samples. Inferential statistical techniques allow me to make a statement like this: If I repeated the process of sampling again and again then I would discover that 95% of the samples were within 8% of the proportion of ALL voters who actually would vote for candidate "A".
- The only problem is I don't know if my sample is one of the 95% (19 of 20--the stippled circles) or one of the 5% (1 in 20--the black circle). Is my sample within 8% of the real value or is it somewhere else? Is it stippled-gray or is it black?
- And that's my success rate. In the long run, over many samples, 19 of 20 (95%) will be within 8% of the true value.
- Actually the news media don't usually give us quite enough information to know precisely what the margin of error means, but this interpretation follows standard statistical practice.
- Inferential statistics is used because we know how often (the proportion of times) the method will work and, conversely, how often it will fail. No other method of decision making provides information on success rates. Instead of using inferential statistics we could hire an expert to look at the data and decide whether the results support the hypothesis or not. But even after the expert tells us her decision, we don't know if it was correct, and surely you don't believe an expert will be right every time.
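The "success rate" reading of the polling example can be checked with a simulation. This is a minimal sketch under stated assumptions, none of which come from the pollster: the true proportion of voters for candidate "A" is set to 0.53 (in a real poll it is unknown), each poll is a simple random sample of 150 voters (a size picked because it gives a margin of roughly 8% at the 95% level under the usual normal approximation), and the name `coverage_rate` and its parameters are hypothetical.

```python
import random

def coverage_rate(true_p=0.53, sample_size=150, margin=0.08,
                  n_samples=5000, seed=0):
    """Fraction of simulated polls whose sample proportion lands within
    `margin` of the true proportion `true_p`.

    Assumptions (illustrative only): true_p is known here, which it never
    is in practice, and every poll is a simple random sample.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(n_samples):
        # Draw one poll: count how many sampled voters favor candidate "A".
        votes_for_a = sum(1 for _ in range(sample_size)
                          if rng.random() < true_p)
        sample_p = votes_for_a / sample_size
        # A "stippled" sample is one that falls within the margin of error.
        if abs(sample_p - true_p) <= margin:
            hits += 1
    return hits / n_samples

if __name__ == "__main__":
    print("share of samples within the margin:", coverage_rate())
```

Under these assumptions the share should come out close to 19 of 20. The simulation can count hits only because it was told the true proportion. A pollster with a single sample cannot tell whether that sample is one of the hits or one of the misses, which is exactly the point made above.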
© 2002 by BurrtonWoodruff. All rights reserved. Modified Sunday, March 25, 2007