




An introduction to hypothesis testing, focusing on the concepts of null and alternative hypotheses, types of errors, and significance levels. The text uses the example of testing the fairness of a coin to illustrate these concepts. It also discusses the importance of setting a level of significance and calculating a test statistic to make a conclusion.
STA100 Lecture
Text: Section 9.1, 9.
Hypothesis Tests

We test hypotheses and draw inferences all the time. Suppose you want to buy a used car. At the risk of oversimplifying, you might buy a "lemon" or you might buy a "terrific" car. Unless you are a mechanic, you probably don't really know which type the car is until after you are done weighing your decision and have spent thousands of dollars.
If you think about it, there are four possibilities: buy a lemon, buy a great car, pass up a lemon, or pass up a great car. You might try to "play the averages" by consulting Kelley Blue Book or some other site to read reviews, etc., but this doesn't really tell you about the car in front of you.
It is much the same thing in a court case. If you are on a jury, you must decide whether the person who has been accused actually committed the crime or not. Even if they confess, you'll never really know whether they are guilty, but you still must make a decision.
Here's a table summarizing the situation:

| | The accused actually did not do the crime | The accused actually did do the crime |
|---|---|---|
| You find them guilty | Incorrect choice | Correct choice |
| You find them not guilty | Correct choice | Incorrect choice |
As you can see there are two types of errors possible here: sending an innocent person to jail and letting a guilty person go free. No court system can possibly be error free all the time. Given that the world is imperfect, how do we proceed intelligently?
A little less dramatically, suppose you have a coin in front of you, and you and I are betting $1 on each coin toss. You probably don't trust me entirely, so I allow you to examine the game coin. What would you do to ensure the game is fair?
Since you don't have a "coin-o-meter" in your laboratory, I guess you're going to start tossing. Suppose you toss 1000 times and obtain 540 Heads. At this point can you reasonably conclude anything? Since we know about population parameters like the population proportion, and about probability models like the binomial distribution, we have everything we need to move forward.
Here is what we are really asking about the coin: "Is the long-term proportion of heads delivered by the coin 50%?" If we call the population proportion of heads delivered by the coin $p$, we are asking "Is $p = 0.5$?" All we can do at this point is construct a confidence interval and see whether it includes 0.5. Do you see that, while you must make a choice as to whether the coin is fair or not, you can never be sure? It's possible for a fair coin to give 540 heads on 1000 tosses, just as it is possible for an unfair coin to give 540 heads on 1000 tosses. The interesting question is: Is it likely?
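One informal way to get a feel for "Is it likely?" is to let a computer toss a fair coin many times. The sketch below is only an illustration, not part of the lecture's method: the function name `simulate_fair_coin`, the number of repetitions, and the seed are all assumptions made for the example. It counts how often 1000 fair tosses land at least as far from 500 Heads as 540 is.

```python
import random

def simulate_fair_coin(trials=10_000, tosses=1000, observed_heads=540, seed=0):
    """Fraction of simulated fair-coin experiments at least as extreme as observed."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    expected = tosses / 2              # 500 Heads expected from a fair coin
    gap = abs(observed_heads - expected)
    extreme = 0
    for _ in range(trials):
        # Each of the `tosses` random bits is one fair toss; 1-bits count as Heads.
        heads = bin(rng.getrandbits(tosses)).count("1")
        if abs(heads - expected) >= gap:
            extreme += 1
    return extreme / trials

fraction_extreme = simulate_fair_coin()
print(f"Share of fair-coin runs at least as extreme as 540 Heads: {fraction_extreme:.4f}")
```

A small share here says that a fair coin can produce 540 Heads, but does not do so often.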
Here's a 95% confidence interval:
Your best guess is that the coin is not fair since the interval does not include 0.5.
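As a rough check, the interval can be recomputed from the 540 Heads in 1000 tosses. This sketch assumes the usual large-sample (Wald) interval for a proportion, $\hat{p} \pm z^{*}\sqrt{\hat{p}(1-\hat{p})/n}$; your text may use a slightly different formula, and the helper name `wald_interval` is made up for the example.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a proportion (assumed formula)."""
    p_hat = successes / n                                     # sample proportion
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = wald_interval(540, 1000)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
print("Interval contains 0.5:", low <= 0.5 <= high)
```

Under that assumption the interval runs from roughly 0.51 to 0.57, which sits entirely above 0.5.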
Before we proceed much further we need to develop a few terms. These are all defined in your textbook; take a moment to look them up before seeing what I have to say about them.
was fair. This is called a Type I error. The other type of error is failing to detect that the coin was biased (heh, heh, heh…) when in fact the coin was not fair. This is called a Type II error. So, we have:

| | $H_0$ is true | $H_0$ is not true |
|---|---|---|
| Reject $H_0$ | Incorrect choice, Type I error | Correct choice |
| Do not reject $H_0$ | Correct choice | Incorrect choice, Type II error |
In our court case, even if you vote "guilty" you are never absolutely sure that the accused committed the crime. You don't need to be sure, just believe guilt "beyond a reasonable doubt", not beyond all doubt. It's the same in the coin toss. Even if you get 1000 Heads on 1000 tosses (an event with probability 9.3326…e-302; that's 301 zeros before the 93326…) you are not absolutely sure the coin is biased. Tiny is not zero. Minuscule is not zero. Thus you have to draw the line somewhere. It is common in statistical tests to use the number 5% as a level of significance. This is the error rate we are willing to accept when rejecting $H_0$. We call this number $\alpha$ (alpha) and say that

$$\text{the level of significance} \equiv \alpha \equiv \text{the probability of rejecting a true null hypothesis}$$
If you would like your level of significance to be zero, you haven't been paying attention. The only way to ensure that no innocent people go to jail is to create a system in which absolutely everyone is set free. That would be chaos.
On the other hand, we need to control our Type II error rate as well. Since we call the probability of committing a Type I error (rejecting a true null) $\alpha$, you won't be surprised that we call the probability of committing a Type II error (failing to reject a false null) $\beta$. Type II errors are a little harder to deal with and we pretty much ignore them in this course. I can't leave the topic before mentioning that people define the power of a statistical test as the probability that we reject a null hypothesis when it is false, that is, $1 - \beta$.
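To make $\alpha$, $\beta$, and power a little more concrete, here is a minimal sketch for the coin example. Everything specific in it is an assumption for illustration: the two-tailed rejection rule based on the normal approximation, the "truly biased" value $p = 0.54$, and the helper names. It uses exact binomial probabilities to ask how often that rule rejects $H_0: p = 0.5$ when the coin is fair (the Type I error rate) and when the coin is biased (the power).

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """Exact probability of k Heads in n tosses when P(Heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def rejection_probability(true_p, n=1000, p0=0.5, alpha=0.05):
    """Chance that a two-tailed z rule rejects H0: p = p0 when the truth is true_p."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed cutoff, about 1.96
    sd0 = sqrt(n * p0 * (1 - p0))                  # standard deviation under H0
    return sum(
        binom_pmf(k, n, true_p)
        for k in range(n + 1)
        if abs(k - n * p0) / sd0 > z_star          # k lands in the rejection region
    )

type_1_rate = rejection_probability(0.5)   # H0 true: rejecting is a Type I error
power = rejection_probability(0.54)        # hypothetical biased coin: rejecting is correct
beta = 1 - power                           # Type II error rate for that coin

print(f"Type I error rate: {type_1_rate:.4f}")
print(f"Power at p = 0.54: {power:.4f}   beta: {beta:.4f}")
```

The Type I rate comes out near, but not exactly at, 5% because the number of Heads is a whole number, and the power depends entirely on how biased we assume the coin to be.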
Once we know what we want to test (we know $H_0$ and $H_1$) and we know how often we are willing to make a mistake in the direction of rejecting a true null (we know $\alpha$), we must consider some sort of evidence. To do this in a statistical test we compute some relevant statistic. In our coin toss example we should calculate the probability that we would get as many as 540 Heads on 1000 tosses with a fair coin. If this probability is too small (typically if it is less than 5%) we will conclude that the coin is not fair. If you think about it for a moment, we would consider a coin that delivered too few Heads to be biased as well. Thus we will calculate the probability that a fair coin would give a result as extreme as 540 Heads. Use the normal approximation to the binomial random variable and compute as follows:

• Convert the target value to standard units: $z = (540 - 500)/\sqrt{1000(0.5)(0.5)} \approx 2.5298$.
• Determine how likely it is to see a data point (statistic) this far away from the hypothesized population parameter.

$$P(-2.5298 < z < 2.5298) \approx P(-2.53 < z < 2.53) = 0.9943 - 0.0057 = 0.9886$$
So, the probability of observing a test statistic as extreme as ours is 1 − 0.9886 = 0.0114. Not impossible, but pretty unlikely. In fact we agreed to reject our null hypothesis if we saw data that was less likely than 5%. Since 0.0114 < 0.05 we reject the null hypothesis and conclude that the coin is not fair.
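The calculation above can be reproduced in a few lines. This is a sketch that uses Python's standard normal distribution in place of a printed $z$ table, so the last digit may differ slightly from table values; the variable names are mine.

```python
from math import sqrt
from statistics import NormalDist

n, heads, p0, alpha = 1000, 540, 0.5, 0.05

mean = n * p0                     # expected Heads for a fair coin: 500
sd = sqrt(n * p0 * (1 - p0))      # binomial standard deviation, about 15.81
z = (heads - mean) / sd           # target value in standard units

# Two-tailed: too many Heads and too few Heads both count as "extreme".
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.4f}")
print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

With these numbers the probability lands at about 0.0114, matching the table-based figure, and the decision is to reject.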
Notice that I was sneaky and made you compute a $p$-value without even telling you about it. We define a $p$-value as the probability, computed assuming the null hypothesis is true, of observing data at least as extreme as the data you collected. Note that we reject our null hypothesis when we observe unlikely data, thus
We reject the null hypothesis when our $p$-value is smaller than $\alpha$.
Assume for the sake of the test that $H_0: \mu_{Fibro} = 50$. Since we feel that these folks will tend to have a lower Life Satisfaction and we wish to refute the null hypothesis, we will conduct a one-tailed test and take as our alternative $H_1: \mu_{Fibro} < 50$. We will then reject the null hypothesis if our data are "too low". Make sure you read about one- versus two-tailed tests in your textbook!
When we do this we really aren't doing math anymore, just trying to proceed as best we reasonably can. It is very common to take $\alpha = 0.05$. It is also common to use $\alpha = 0.01$. To see this in action we will use this smaller value here.
Since we are dealing with means and with large sample sizes we will compute a $z$ statistic. Remember that

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
tailed test). We find the probability that $z < -2.28$. From our $z$ table I get $p = 0.0113$. Since this number is not less than $\alpha = 0.01$ we may not reject the null hypothesis at the $\alpha = 0.01$ level of significance. (Note that if we had been less picky we could have rejected at the $\alpha = 0.05$ level of significance. It all depends upon where you initially set up the goal posts.)
accept, just like the courts never say that someone is innocent.
We were not able to reject the null hypothesis at the stated level of significance. These data are insufficient to conclude that individuals with Fibromyalgia have a lower Life Satisfaction than the general population.
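The decision can be checked numerically. The sketch below takes the test statistic $z = -2.28$ as given and uses Python's standard normal distribution instead of a $z$ table; it then compares the one-tailed $p$-value against both common choices of $\alpha$. The variable names are assumptions for the example.

```python
from statistics import NormalDist

z = -2.28                          # test statistic from the lecture
p_value = NormalDist().cdf(z)      # lower-tail area, since H1 says "less than"

print(f"one-tailed p-value = {p_value:.4f}")
for alpha in (0.01, 0.05):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```

The same data lead to different decisions at the two levels, which is why $\alpha$ has to be fixed before looking at the data.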
Here is your first presentation problem of the week. Suppose we had instead used $n = $ […] for a sample size and did not know the standard deviation but instead had to estimate it with $s = $ […]. If we had obtained $\bar{x} = $ […], would we be able to conclude $H_1: \mu_{Fibro} < 50$ at the $\alpha = $ […] level of significance? This is called a t-test and is the subject of the next lecture.