Understanding Hypothesis Testing: Null & Alternative Hypotheses, Errors, Significance - Pr, Exams of Data Analysis & Statistical Methods

State University of New York Polytechnic - Utica-Rome Data Analysis & Statistical Methods

Prof. William Thistleton

An introduction to hypothesis testing, focusing on the concepts of null and alternative hypotheses, types of errors, and significance levels. The text uses the example of testing the fairness of a coin to illustrate these concepts. It also discusses the importance of setting a level of significance and calculating a test statistic to make a conclusion.

Typology: Exams

Pre 2010

Uploaded on 08/09/2009

koofers-user-0bm 🇺🇸


STA100

Lecture 19

Text: Section 9.1, 9.2

Hypothesis Tests

We test hypotheses and draw inferences all the time. Suppose you want to buy a used car. At the risk of oversimplifying, you might buy a “lemon” or you might buy a “terrific” car. Unless you are a mechanic you probably don’t really know which type the car is until after you are done weighing your decision and have spent thousands of dollars.

If you think about it, there are four possibilities: buy a lemon, buy a great car, pass up a lemon, pass up a great car. You might try to “play the averages” by consulting Kelley Blue Book or some other site to read reviews, etc., but this doesn’t really tell you about the car in front of you.

It is much the same thing in a court case. If you are on a jury you must decide whether the person who has been accused actually did the crime or not. Even if they confess you’ll never really know whether they are guilty, but you still must make a decision.

Here’s a table summarizing the situation:

| | The accused actually did not do the crime | The accused actually did do the crime |
| --- | --- | --- |
| You find them guilty | Incorrect choice | Correct choice |
| You find them not guilty | Correct choice | Incorrect choice |

As you can see there are two types of errors possible here: sending an innocent person to jail and letting a guilty person go free. No court system can possibly be error free all the time. Given that the world is imperfect, how do we proceed intelligently?


A little less dramatically, suppose you have a coin in front of you, and you and I are betting $1 on each coin toss. You probably don’t trust me entirely, so I allow you to examine the game coin. What would you do to ensure the game is fair?

Since you don’t have a “coin-o-meter” in your laboratory, I guess you’re going to start tossing. Suppose you toss 1000 times and obtain 540 Heads. At this point can you reasonably conclude anything? Since we know about population parameters like the population proportion, and about probability models like the binomial distribution, we have everything we need to move forward.

Here is what we are really asking about the coin: “Is the long term proportion of heads delivered by the coin 50%?” If we call the population proportion of heads delivered by the coin 𝑝 we are asking “Is 𝑝 = 0.5 ?” All we can do at this point is construct a confidence interval and see whether it includes 0.5. Do you see that, while you must make a choice as to whether the coin is fair or not, you can never be sure? It’s possible for a fair coin to give 540 heads on 1000 tosses, just as it is possible for an unfair coin to give 540 heads on 1000 tosses. The interesting question is: Is it likely?

Here’s a 95% confidence interval:

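$$\hat{p} \pm z_c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

$$0.54 \pm 1.96 \sqrt{\frac{0.54(1 - 0.54)}{1000}} \approx 0.54 \pm 0.031$$

That is, the interval runs from roughly 0.509 to 0.571.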

Your best guess is that the coin is not fair since the interval does not include 0.5.
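The interval arithmetic can be checked with a short script. This is only a sketch using Python's standard library; the function name `proportion_confidence_interval` is made up for this illustration and is not from the textbook.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    # Critical value z_c; about 1.96 for 95% confidence.
    z_c = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_c * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 540 Heads on 1000 tosses, as in the example above.
low, high = proportion_confidence_interval(540, 1000)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
print("Interval contains 0.5:", low <= 0.5 <= high)
```

The lower end of the interval sits just above 0.5, which is why the coin looks suspicious but not outrageously so.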

Before we proceed much further we need to develop a few terms. These are all defined in your textbook; take a moment to look them up before seeing what I have to say about them.

Null and Alternative Hypothesis
Types of Errors
Left tailed, right tailed, and two tailed tests

[...] One type of error is concluding that the coin was biased when in fact the coin was fair. This is called a Type I error. The other type of error is failing to detect that the coin was biased (heh, heh, heh…) when in fact the coin was not fair. This is called a Type II error. So, we have:

| | 𝐻_0 is true | 𝐻_0 is not true |
| --- | --- | --- |
| Reject 𝐻_0 | Incorrect choice, Type I error | Correct choice |
| Do Not Reject 𝐻_0 | Correct choice | Incorrect choice, Type II error |

In our court case, even if you vote “guilty” you are never absolutely sure that the accused committed the crime. You don’t need to be sure, just believe guilt “beyond a reasonable doubt”, not beyond all doubt. It’s the same in the coin toss. Even if you get 1000 Heads on 1000 tosses (an event with probability 9.3326…e-302; that’s 301 zeros before the 93326…) you are not absolutely sure the coin is biased. Tiny is not zero. Minuscule is not zero. Thus you have to draw the line somewhere. It is common in statistical tests to use the number 5% as a level of significance. This is the rate at which we are willing to reject 𝐻_0 when it is actually true. We call this number 𝛼 (alpha) and say that

the level of significance ≡ 𝛼 ≡ the probability of rejecting a true null hypothesis

If you would like your level of significance to be zero you haven’t been paying attention. The only way to ensure that no innocent people go to jail is to create a system in which absolutely everyone is set free. That would be chaos.
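Two of the numbers in this discussion can be checked directly. The sketch below first evaluates the probability of 1000 Heads in 1000 tosses, then computes how often a two-tailed test at 𝛼 = 0.05 would wrongly reject a fair coin tossed 1000 times. The function name is hypothetical, and the rejection rule (reject when the standardized proportion of Heads falls beyond the critical value) is an assumption about how the test is set up. Because the number of Heads is a whole number, the exact rate comes out close to 5% rather than exactly 5%.

```python
from math import comb, sqrt
from statistics import NormalDist

# Probability that a fair coin gives 1000 Heads in 1000 tosses.
p_all_heads = 0.5 ** 1000
print(f"P(1000 Heads) = {p_all_heads:.4e}")


def exact_type_i_error_rate(n, alpha=0.05):
    """Exact chance that a two-tailed z test rejects a coin that really is fair."""
    z_c = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96
    se = sqrt(0.25 / n)                        # standard error when p = 0.5
    rejecting_outcomes = 0
    for k in range(n + 1):
        z = (k / n - 0.5) / se
        if abs(z) > z_c:
            # Count the equally likely toss sequences with k Heads.
            rejecting_outcomes += comb(n, k)
    return rejecting_outcomes / 2**n


rate = exact_type_i_error_rate(1000)
print(f"Type I error rate for a fair coin: {rate:.4f}")
```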

On the other hand, we need to control our Type II error rate as well. Since we call the probability of committing a Type I error (rejecting a true null) 𝛼, you won’t be surprised that we call the probability of committing a Type II error (failing to reject a false null) 𝛽. Type II errors are a little harder to deal with and we pretty much ignore them in this course. I can’t leave the topic before mentioning that people define the power of a statistical test as the probability we reject a null hypothesis when it is false, that is, 1 − 𝛽.
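To make 𝛽 and power a little more concrete, here is a rough sketch for the coin example. It assumes a two-tailed test at 𝛼 = 0.05 with 1000 tosses and a purely hypothetical true proportion of Heads of 0.54; both that value and the function name `approximate_power` are chosen for illustration only. It uses the normal approximation throughout.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p_true, n, p0=0.5, alpha=0.05):
    """Normal-approximation power and beta for the two-tailed test of H0: p = p0."""
    z_c = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = sqrt(p0 * (1 - p0) / n)
    # We fail to reject when the sample proportion lands between these cutoffs.
    lower_cut = p0 - z_c * se_null
    upper_cut = p0 + z_c * se_null
    # Sampling distribution of the sample proportion if the true value is p_true.
    sampling = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    beta = sampling.cdf(upper_cut) - sampling.cdf(lower_cut)
    return 1 - beta, beta


power, beta = approximate_power(p_true=0.54, n=1000)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Setting `p_true` equal to `p0` returns a "power" equal to 𝛼, which is a handy sanity check: when the null is true, every rejection is a Type I error.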

Once we know what we want to test (we know 𝐻_0 and 𝐻_1) and we know how often we are willing to make a mistake in the direction of rejecting a true null (we know 𝛼), we must consider some sort of evidence. To do this in a statistical test we compute some relevant statistic. In our coin toss example we should calculate the probability that we would get 540 or more Heads on 1000 tosses with a fair coin. If this probability is too small (typically if it is less than 5%) we will conclude that the coin is not fair. If you think about it for a moment, we would consider a coin that delivered too few Heads to be biased as well. Thus we will calculate the probability that a fair coin would give a result at least as extreme as 540 Heads, in either direction. Use the normal approximation to the binomial random variable and compute as follows:

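• Convert the target value to standard units

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} = \frac{0.54 - 0.5}{\sqrt{\dfrac{0.5(1 - 0.5)}{1000}}} \approx 2.5298$$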

• Determine how likely it is to see a data point (statistic) this far away from the hypothesized population parameter.

$$P(-2.5298 < z < 2.5298) \approx P(-2.53 < z < 2.53) = 0.9943 - 0.0057 = 0.9886$$

So, the probability of observing a test statistic as extreme as ours is 1 − 0.9886 = 0.0114. Not impossible, but pretty unlikely. In fact we agreed to reject our null hypothesis if we saw data that was less likely than 5%. Since 0.0114 < 0.05 we reject the null hypothesis and conclude that the coin is not fair.
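The two steps above can be sketched in a few lines of Python. The function name `two_tailed_p_value` is invented for this illustration, and the standard library's `NormalDist` stands in for the z table.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p_value(successes, n, p0=0.5):
    """z statistic and two-tailed probability under H0: p = p0 (normal approximation)."""
    p_hat = successes / n
    # Standard error uses the hypothesized p0, not the sample proportion.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Probability of landing at least this far from p0 in either direction.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_tailed_p_value(540, 1000)
print(f"z = {z:.4f}, two-tailed probability = {p_value:.4f}")
print("Reject at the 5% level:", p_value < 0.05)
```

Note that the standard error here is built from the hypothesized value 0.5, whereas the confidence interval earlier used the observed 0.54. That is why the two calculations use slightly different denominators.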

Notice that I was sneaky and made you compute a 𝑝-value without even telling you about it. We define a 𝑝-value as the probability, computed assuming the null hypothesis is true, of observing data at least as extreme as the data you collected. Note that we reject our null hypothesis when we observe unlikely data, thus

We reject the null hypothesis when our 𝑝-value is smaller than 𝛼.
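As a cross-check on the normal approximation (which gave 0.0114 for the coin), the same 𝑝-value can be computed exactly from the binomial model. This is a sketch only: the function name is hypothetical, and "as extreme" is taken to mean at least as far from 500 Heads in either direction.

```python
from math import comb


def exact_two_tailed_p_value(heads, n):
    """Exact chance that a fair coin lands at least as far from n/2 Heads as `heads` did."""
    distance = abs(heads - n / 2)
    # Count the equally likely toss sequences whose number of Heads is as extreme.
    extreme = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance)
    return extreme / 2**n


p_exact = exact_two_tailed_p_value(540, 1000)
print(f"Exact two-tailed p-value: {p_exact:.4f}")
print("Reject at the 5% level:", p_exact < 0.05)
```

The exact value is slightly larger than the approximate one because the binomial counts are discrete, but the conclusion at the 5% level is the same.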

Assume for the sake of the test that 𝐻_0: 𝜇_Fibro = 50. Since we feel that these folks will tend to have a lower Life Satisfaction and we wish to refute the null hypothesis, we will conduct a one-tailed test and take as our alternative 𝐻_1: 𝜇_Fibro < 50. We will then reject the null hypothesis if our data are “too low”. Make sure you read about one versus two tailed tests in your textbook!

Decide upon a level of significance, 𝛼.

When we do this we really aren’t doing math anymore, just trying to proceed as best we reasonably can. It is very common to take 𝛼 = 0.05. It is also common to use 𝛼 = 0.01. To see this in action we will use this smaller value here.

Compute a test statistic (𝑧, 𝑡, 𝜒², and 𝐹 are popular stats).

Since we are dealing with means and with large sample sizes we will compute a 𝑧 statistic. Remember that

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{48 - 50}{10 / \sqrt{n}}$$

Find the 𝑝-value corresponding to your test statistic (for a left tailed, a right tailed, or a two tailed test).

We find the probability that 𝑧 < −2.28. From our z table I get 𝑝 = 0.0113. Since this number is not less than 𝛼 = 0.01 we may not reject the null hypothesis at the 𝛼 = 0.01 level of significance. (Note that if we had been less picky we could have rejected at the 𝛼 = 0.05 level of significance. It all depends upon where you initially set up the goal posts.)

Form a conclusion: if 𝑝 < 𝛼 (improbable data) reject 𝐻_0, otherwise do not reject. We never accept, just like the courts never say that someone is innocent.

We were not able to reject the null hypothesis at the stated level of significance. These data are insufficient to conclude that individuals with Fibromyalgia have a lower Life Satisfaction than the general population.
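The last two steps (find the 𝑝-value, then compare it with 𝛼) can be sketched as follows for the left-tailed test. The function name `left_tailed_z_test` is made up for illustration, and `NormalDist` replaces the z table; the value 𝑧 = −2.28 is the one used above.

```python
from statistics import NormalDist


def left_tailed_z_test(z, alpha):
    """p-value and decision for a left-tailed z test."""
    # Left-tailed: the p-value is the area under the standard normal curve below z.
    p_value = NormalDist().cdf(z)
    return p_value, p_value < alpha


z = -2.28
for alpha in (0.01, 0.05):
    p_value, reject = left_tailed_z_test(z, alpha)
    verdict = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha}: p = {p_value:.4f}, {verdict}")
```

The same data lead to different decisions at the two levels, which is exactly the point about setting the goal posts before looking at the data.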

Here is your first presentation problem of the week. Suppose we had instead used 𝑛 = 25 for a sample size and did not know the standard deviation but instead had to estimate it with 𝑠 = 6. If we had obtained x̄ = 44 would we be able to conclude 𝐻_1: 𝜇_Fibro < 50 at the 𝛼 = 0.05 level of significance? This is called a t-test and is the subject of the next lecture.