






Lecture notes from STAT 9220, a biostatistics course at the Medical College of Georgia, covering hypothesis tests, statistical errors, p-values, and confidence sets in statistical inference: the level of significance, the difference between type I and type II errors, and how to calculate p-values and confidence intervals.
To test the hypotheses H_0: P ∈ P_0 versus H_1: P ∈ P_1, there are two types of statistical errors we may commit: rejecting H_0 when H_0 is true (called the type I error) and accepting H_0 when H_0 is wrong (called the type II error). Let T be a test, i.e., a statistic from X to {0, 1}. The probabilities of making the two types of errors are

α_T(P) = P(T(X) = 1), P ∈ P_0 (type I)

and

1 − α_T(P) = P(T(X) = 0), P ∈ P_1 (type II).
An optimal decision rule (one minimizing both error probabilities simultaneously) does not exist. Therefore we assign a small bound α to the type I error probability and minimize 1 − α_T(P) for P ∈ P_1, subject to α_T(P) ≤ α for all P ∈ P_0. The bound α is called the level of significance, and the size of T is

sup_{P ∈ P_0} α_T(P).
The choice of a level of significance α is usually somewhat subjective. In most applications there is no precise limit on the size of T that can be tolerated. Standard values, such as 0.10, 0.05, or 0.01, are often used for convenience. Often we only impose a bound on the size, i.e.,

sup_{P ∈ P_0} α_T(P) ≤ α. (7.1)

In general, a smaller α leads to a smaller rejection region.
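The tradeoff behind fixing α first can be made concrete in the one-sided normal-mean test used in the examples below (reject H_0: μ ≤ μ_0 when x̄ > c, with σ known). This is a minimal sketch, not part of the notes; the values n = 25, σ = 2, μ_0 = 0 and the alternative μ = 1 are illustrative assumptions:

```python
import math

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def type1_error(c, mu0, sigma, n):
    # alpha_T(P) = P(T(X) = 1) = P(Xbar > c) at the boundary mu = mu0,
    # where the type I error probability is largest
    return 1.0 - Phi(math.sqrt(n) * (c - mu0) / sigma)

def type2_error(c, mu1, sigma, n):
    # 1 - alpha_T(P) = P(T(X) = 0) = P(Xbar <= c) at an alternative mu1
    return Phi(math.sqrt(n) * (c - mu1) / sigma)

n, sigma, mu0 = 25, 2.0, 0.0
c = mu0 + 1.6448536269514722 * sigma / math.sqrt(n)  # z_{0.95}, giving size 0.05
alpha = type1_error(c, mu0, sigma, n)
beta = type2_error(c, 1.0, sigma, n)
```

Lowering the bound α pushes c up and inflates the type II error probability β, which is why α is fixed first and the type II error is then minimized over the alternative.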
It is good practice to determine not only whether H_0 is rejected or accepted for a given α and a chosen test T_α, but also the smallest possible level of significance at which H_0 would be rejected for the computed T_α(x), i.e.,

α̂(x) = inf{α ∈ (0, 1) : T_α(x) = 1}.

This α̂, called the p-value, is the smallest possible value of α for which H_0 would be rejected given the computed T_α(x). Note that α̂ is a statistic.
Example 7.2.2. Consider the problem in Example 7.1.1. Let us calculate the p-value for T_{c_α}. Note that

α = 1 − Φ(√n(c_α − μ_0)/σ) > 1 − Φ(√n(x̄ − μ_0)/σ)

if and only if x̄ > c_α (i.e., T_{c_α}(x) = 1). Hence

1 − Φ(√n(x̄ − μ_0)/σ) = inf{α ∈ (0, 1) : T_{c_α}(x) = 1} = α̂(x)

is the p-value for T_{c_α}. Thus the decision in this testing problem may be written more concisely as

T_{c_α}(x) = I_{(0,α)}(α̂(x)).
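The closed-form p-value above can be evaluated directly. A sketch under the same known-σ setup; the observed x̄ = 0.8 and the values μ_0 = 0, σ = 2, n = 25 are illustrative assumptions:

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_value(xbar, mu0, sigma, n):
    # hat-alpha(x) = 1 - Phi(sqrt(n)(xbar - mu0)/sigma)
    return 1.0 - Phi(math.sqrt(n) * (xbar - mu0) / sigma)

def reject(xbar, mu0, sigma, n, alpha):
    # the decision rule T_{c_alpha}(x) = I_{(0, alpha)}(hat-alpha(x))
    return p_value(xbar, mu0, sigma, n) < alpha

phat = p_value(xbar=0.8, mu0=0.0, sigma=2.0, n=25)  # z-statistic = 2.0
```

With phat ≈ 0.0228, H_0 is rejected at α = 0.05 but not at α = 0.01, without re-running the test for each level; this is the "additional information" a p-value provides.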
Because of the additional information they provide, p-values are typically more appropriate than fixed-level tests in a scientific problem. However, a fixed level of significance is unavoidable when the acceptance or rejection of H_0 implies an imminent concrete decision.
Example 7.3.1. Assume that the sample X has the binomial distribution Bi(θ, n) with an unknown θ ∈ (0, 1) and a fixed integer n > 1. Consider the hypotheses H_0: θ ∈ (0, θ_0] versus H_1: θ ∈ (θ_0, 1), where θ_0 ∈ (0, 1) is a fixed value. Consider the following class of randomized tests:

T_{j,q}(X) = { 1, X > j;  q, X = j;  0, X < j },

where j = 0, 1, ..., n − 1 and q ∈ [0, 1]. Then

α_{T_{j,q}}(θ) = P(X > j) + qP(X = j), 0 < θ ≤ θ_0,

and

1 − α_{T_{j,q}}(θ) = P(X < j) + (1 − q)P(X = j), θ_0 < θ < 1.

It can be shown that for any α ∈ (0, 1), there exist an integer j and q ∈ (0, 1) such that the size of T_{j,q} is α.
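The existence claim at the end can be made constructive: α_{T_{j,q}}(θ) is nondecreasing in θ (the binomial family has monotone likelihood ratio), so the size equals α_{T_{j,q}}(θ_0); pick the smallest j with P(X > j) ≤ α at θ = θ_0 and solve for q. A sketch, with n = 10, θ_0 = 0.5, α = 0.05 as illustrative values:

```python
import math

def binom_pmf(n, theta, k):
    """P(X = k) for X ~ Bi(theta, n)."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

def size_and_choice(n, theta0, alpha):
    """Find j and q with P(X > j) + q P(X = j) = alpha at theta = theta0,
    i.e., so that the size of T_{j,q} is exactly alpha."""
    pmf = [binom_pmf(n, theta0, k) for k in range(n + 1)]
    for j in range(n + 1):
        tail = sum(pmf[j + 1:])          # P(X > j) at theta0
        if tail <= alpha:
            q = (alpha - tail) / pmf[j]  # solve tail + q * pmf[j] = alpha
            return j, q
    raise ValueError("unreachable for alpha in (0, 1)")

j, q = size_and_choice(n=10, theta0=0.5, alpha=0.05)
size = (sum(binom_pmf(10, 0.5, k) for k in range(j + 1, 11))
        + q * binom_pmf(10, 0.5, j))
```

Here the randomization probability q absorbs the gap between the discrete tail probabilities, which is exactly why a non-randomized test cannot attain every size α.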
Definition 7.4.1. Let ϑ be a real-valued parameter related to the unknown population P ∈ P, and let C(X) ∈ B_Θ̃ depend only on X, where Θ̃ ⊂ R is the range of ϑ and B_Θ̃ denotes the Borel sets in Θ̃. If

inf_{P ∈ P} P(ϑ ∈ C(X)) ≥ 1 − α, (7.2)

where α ∈ (0, 1) is fixed, then C(X) is called a confidence set for ϑ with level of significance 1 − α, and the infimum on the left-hand side of (7.2) is called the confidence coefficient of C(X). If (7.2) holds, then the coverage probability of C(X) is at least 1 − α, although C(x) either covers or does not cover ϑ once we observe X = x. If C(X) = [ϑ̲(X), ϑ̄(X)] for a pair of real-valued statistics ϑ̲ and ϑ̄, then C(X) is called a confidence interval for ϑ.
Example 7.4.1. In the setup of the previous examples (σ² known), consider a confidence interval for ϑ = μ. It is enough to consider C(X̄) since X̄ is sufficient. Note that

P(μ ∈ [X̄ − c, X̄ + c]) = P(|X̄ − μ| ≤ c) = 1 − 2Φ(−√n c/σ),

which does not depend on μ. Hence, the confidence coefficient of [X̄ − c, X̄ + c] is 1 − 2Φ(−√n c/σ), which is an increasing function of c, converging to 1 as c → ∞ and to 0 as c → 0.
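The map from c to the confidence coefficient can be computed and, being increasing, inverted numerically to find the c attaining a given level. A sketch; σ = 2, n = 25, and the target level 0.95 are illustrative assumptions:

```python
import math

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def confidence_coefficient(c, sigma, n):
    # coverage of [Xbar - c, Xbar + c]: 1 - 2*Phi(-sqrt(n)*c/sigma)
    return 1.0 - 2.0 * Phi(-math.sqrt(n) * c / sigma)

def c_for_level(level, sigma, n, lo=0.0, hi=1e6, tol=1e-10):
    # invert the increasing function above by bisection
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if confidence_coefficient(mid, sigma, n) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

cov = confidence_coefficient(c=0.7839856, sigma=2.0, n=25)  # ≈ 0.95
c95 = c_for_level(0.95, sigma=2.0, n=25)
```

The monotonicity noted in the example is what makes the bisection well defined: longer intervals always cover with higher probability.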
Example 7.4.2. Let X_1, ..., X_n be i.i.d. from the N(μ, σ²) distribution with both μ ∈ R and σ² > 0 unknown. Let θ = (μ, σ²) and α ∈ (0, 1) be fixed. Let X̄ be the sample mean and S² be the sample variance. Since (X̄, S²) is sufficient for θ, we focus on C(X) that is a function of (X̄, S²). X̄ and S² are independent, and (n − 1)S²/σ² ∼ χ²(n − 1). Since √n(X̄ − μ)/σ ∼ N(0, 1), we can choose constants with

P(|√n(X̄ − μ)/σ| ≤ c̃_α) = √(1 − α)

and

P(c_{1α} ≤ (n − 1)S²/σ² ≤ c_{2α}) = √(1 − α),

using the χ²(n − 1) distribution to find c_{1α} and c_{2α}. Hence,

P(−c̃_α ≤ √n(X̄ − μ)/σ ≤ c̃_α, c_{1α} ≤ (n − 1)S²/σ² ≤ c_{2α}) = 1 − α

(by independence), i.e.,

P(n(X̄ − μ)²/c̃_α² ≤ σ², (n − 1)S²/c_{2α} ≤ σ² ≤ (n − 1)S²/c_{1α}) = 1 − α,

which gives a confidence set for θ = (μ, σ²) with confidence coefficient 1 − α.
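The joint coverage can be checked by simulation. This sketch is not part of the notes: the equal-tail choice of c_{1α}, c_{2α} and the settings n = 10, α = 0.10 are illustrative assumptions, and the χ²(n − 1) quantiles are estimated by Monte Carlo rather than taken from tables:

```python
import math
import random
import statistics

random.seed(7)
n, alpha = 10, 0.10
p = math.sqrt(1.0 - alpha)  # marginal level sqrt(1 - alpha) for each piece

# c~_alpha with P(|Z| <= c~_alpha) = sqrt(1 - alpha), Z ~ N(0, 1)
c_tilde = statistics.NormalDist().inv_cdf((1.0 + p) / 2.0)

# Monte Carlo equal-tail quantiles c_{1a} <= c_{2a} of chi^2(n - 1),
# so that P(c_{1a} <= V <= c_{2a}) = sqrt(1 - alpha)
draws = sorted(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n - 1))
               for _ in range(100_000))
c1 = draws[int((1.0 - p) / 2.0 * len(draws))]
c2 = draws[int((1.0 + p) / 2.0 * len(draws))]

# simulate the joint coverage of the set for theta = (mu, sigma^2)
mu, sigma = 0.0, 1.0
reps, hits = 10_000, 0
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    in_mu_part = n * (xbar - mu) ** 2 / c_tilde ** 2 <= sigma ** 2
    in_s2_part = (n - 1) * s2 / c2 <= sigma ** 2 <= (n - 1) * s2 / c1
    if in_mu_part and in_s2_part:
        hits += 1
coverage = hits / reps  # close to 1 - alpha = 0.90
```

The two events are independent (one involves only X̄, the other only S²), so marginal levels √(1 − α) multiply to the joint level 1 − α, which the simulated coverage reflects.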
Remark 7.4.2 (Some final remarks). For a general confidence interval [ϑ̲(X), ϑ̄(X)], its length is ϑ̄(X) − ϑ̲(X), which may be random. We may consider the expected (or average) length E[ϑ̄(X) − ϑ̲(X)]. The confidence coefficient and the expected length are a pair of good measures of the performance of a confidence interval. As with the two types of error probabilities of a test in hypothesis testing, however, we cannot maximize the confidence coefficient and minimize the length (or expected length) simultaneously. A common approach is to minimize the length (or expected length) subject to a lower bound on the confidence coefficient. Note that for an unbounded confidence interval, the length is ∞.
The idea of confidence pictures has recently become much more popular due to relatively easy access to computationally intensive graphical tools (e.g., density estimators or level plots).