Chapter 21: More About Tests

Zero In on the Null
• The null hypothesis must be a statement about the value of a parameter for a model.
• We use this value to compute the probability that the observed sample statistic (or something even farther from the null value) would occur.
• The null hypothesis comes from the context of the problem or situation, NOT from the data.
How to Think About P-values
• A P-value is a conditional probability. It tells us the probability of getting results at least as unusual as the observed statistic, given that the null hypothesis is true: P( data | H0 ).
• The P-value is NOT the probability that the null hypothesis is true!
• The lower the P-value, the more comfortable we feel about our decision to reject the null hypothesis, BUT the null hypothesis does not become any more false.
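As a sketch of the idea P( data | H0 ), here is a one-proportion z-test on hypothetical coin-flip data (the numbers are invented, not from the text): assume H0: p = 0.5 and suppose we observe 58 heads in 100 flips.

```python
from statistics import NormalDist

# Hypothetical data: H0 says p = 0.5; we observe 58 heads in 100 flips.
p0, n, successes = 0.5, 100, 58
p_hat = successes / n
se = (p0 * (1 - p0) / n) ** 0.5          # SD of p_hat assuming H0 is true
z = (p_hat - p0) / se                    # how unusual is our statistic under H0?
# Two-sided P-value: the chance, GIVEN H0, of a statistic at least this extreme
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_value, 4))    # → 1.6 0.1096
```

Note that the calculation starts by assuming the null value p0 is true; that is what makes the P-value a conditional probability about the data, not about the hypothesis.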
Alpha Levels
We can define “rare event” arbitrarily by setting a threshold for our P-value. If our P-value falls below that point, we reject the null hypothesis, and we call such results statistically significant.
Symbol: α
Common levels: 0.1, 0.05, 0.01
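The decision rule can be sketched in a few lines (the P-value of 0.032 is hypothetical): the same result is “statistically significant” at some alpha levels and not at others.

```python
# Decision rule: reject H0 when the P-value falls below alpha.
p_value = 0.032   # hypothetical P-value from some test
for alpha in (0.1, 0.05, 0.01):
    decision = "reject H0 (significant)" if p_value < alpha else "fail to reject H0"
    print(f"alpha={alpha}: {decision}")
# → reject at 0.1 and 0.05, but fail to reject at 0.01
```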
Traditional Critical Values from the Normal Model

α         0.1      0.05     0.01     0.001
1-sided   1.282    1.645    2.326    3.090
2-sided   1.645    1.960    2.576    3.291
We can also find these values in a t-table: use the tail-probability columns and the df = ∞ row.
• If the test is one-sided, all of α goes in one tail.
• If the test is two-sided, α is split equally between the two tails.
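The table values above can be reproduced from the standard Normal model; a quick sketch using Python’s standard library:

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal model
for alpha in (0.1, 0.05, 0.01, 0.001):
    one_sided = z.inv_cdf(1 - alpha)        # all of alpha in one tail
    two_sided = z.inv_cdf(1 - alpha / 2)    # alpha split between the two tails
    print(f"alpha={alpha}: z*(1-sided)={one_sided:.3f}, z*(2-sided)={two_sided:.3f}")
```

For example, alpha = 0.05 gives 1.645 one-sided and 1.960 two-sided, matching the table.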
Making Errors
Type I Error – the null hypothesis is TRUE, but we mistakenly reject it.
Type II Error – the null hypothesis is FALSE, but we fail to reject it.
** When you choose an alpha level, you are setting the probability of a Type I Error.
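A simulation can make “α is the probability of a Type I Error” concrete: if H0 really is true and we test at α = 0.05, we should wrongly reject about 5% of the time. This is a hypothetical sketch using the one-proportion z-test; the numbers are invented.

```python
import random
from statistics import NormalDist

random.seed(1)
alpha, p0, n = 0.05, 0.5, 100
z_star = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value (1.96)
se = (p0 * (1 - p0) / n) ** 0.5

# Simulate many samples where H0 is actually TRUE and count false rejections.
trials = 10_000
rejections = 0
for _ in range(trials):
    successes = sum(random.random() < p0 for _ in range(n))
    z = (successes / n - p0) / se
    rejections += abs(z) > z_star
print(rejections / trials)   # near (not exactly) 0.05: counts are discrete and
                             # the Normal model is only an approximation
```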
Power • we can never prove a null hypothesis is true; we only fail to reject it • when we fail to reject a null hypothesis, it is natural to wonder whether we looked hard enough (Was our test too weak to tell?) Power – is a test’s ability to detect a false null hypothesis or the probability that a test correctly rejects a false null hypothesis When the power is high, we can be confident that we’ve looked hard enough at the situation.
The value of the power depends on how far the truth lies from the null hypothesis value. The distance between the null hypothesis value, p0, and the truth, p, is called the effect size. Power depends directly on effect size: larger effects are easier to detect.
Power = 1 – β, where β is the probability of a Type II Error
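Power can be computed directly for a one-proportion z-test. In this hypothetical sketch the null value is p0 = 0.5, the truth is p = 0.6 (effect size 0.1), and n = 100; power is the probability, under the truth, that p-hat lands beyond the cutoffs set under H0.

```python
from statistics import NormalDist

alpha, n = 0.05, 100
p0, p_true = 0.5, 0.6                       # hypothetical null value and truth
z_star = NormalDist().inv_cdf(1 - alpha / 2)

se0 = (p0 * (1 - p0) / n) ** 0.5            # SD under H0 (sets the cutoffs)
se1 = (p_true * (1 - p_true) / n) ** 0.5    # SD under the truth

upper = p0 + z_star * se0                   # reject H0 when p_hat is outside
lower = p0 - z_star * se0                   # the interval [lower, upper]

truth = NormalDist(p_true, se1)
power = (1 - truth.cdf(upper)) + truth.cdf(lower)   # P(reject | H0 false)
beta = 1 - power                                    # P(Type II Error)
print(round(power, 3), round(beta, 3))
```

Here the power comes out near 0.52: even with a real effect of 0.1, a sample of 100 detects it only about half the time.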
For a fixed sample size, if we reduce the probability of a Type I Error ( α ), we automatically increase the probability of a Type II Error ( β ). We can reduce the probability of BOTH Type I and Type II Errors by reducing the standard deviation of the sampling distribution (the spread), which also increases the power. Take a larger sample!
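The effect of sample size on power can be sketched by wrapping the power calculation in a function and varying n while holding the effect size fixed (the values 0.5 vs 0.55 are hypothetical):

```python
from statistics import NormalDist

def power_one_prop(p0, p_true, n, alpha=0.05):
    """Approximate power of a two-sided one-proportion z-test (Normal-model sketch)."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    se0 = (p0 * (1 - p0) / n) ** 0.5          # spread under H0 (sets the cutoffs)
    se1 = (p_true * (1 - p_true) / n) ** 0.5  # spread under the truth
    truth = NormalDist(p_true, se1)
    return (1 - truth.cdf(p0 + z_star * se0)) + truth.cdf(p0 - z_star * se0)

# Same effect size (0.55 vs 0.50); larger samples shrink the spread and raise power.
for n in (50, 100, 400, 1000):
    print(n, round(power_one_prop(0.5, 0.55, n), 3))
```

The printed powers increase steadily with n, which is exactly the “take a larger sample” advice in numbers.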
What Can Go Wrong? • Don’t interpret the P-value as the probability that H0 is true. – The P-value is about the data, not the hypothesis. – It’s the probability of the data given that H0 is true, not the other way around. • Don’t believe too strongly in arbitrary alpha levels. – It’s better to report your P-value and a confidence interval so that the reader can make her/his own decision.
• Don’t confuse practical and statistical significance. – Just because a test is statistically significant doesn’t mean that it is significant in practice. – And sample size can impact your decision about a null hypothesis: a sample that is too small may miss an important difference, while a very large sample may flag a practically “insignificant” difference as statistically significant. • Don’t forget that in spite of all your care, you might make a wrong decision.