Instructions for programming logistic regression using iterated weighted least squares (IWLS) and ordered logit/probit models in R. It includes explanations of the methods, programming hints, and problems to solve. The document also covers the concept of ordered logit and probit models, their cumulative distributions and log-likelihood function, and how to calculate individual-category probabilities.
Typology: Exercises
(a) Set the regression coefficients to initial values, such as $\beta^{(0)} = 0$ (where the superscript 0 indicates start values).
(b) At each iteration $t$, calculate the current fitted probabilities $\mu$, variance-function values $v$, working-response values $z$, and weights $w$:
$$\mu_i^{(t)} = \left[1 + \exp\left(-\eta_i^{(t)}\right)\right]^{-1}$$
$$v_i^{(t)} = \mu_i^{(t)}\left(1 - \mu_i^{(t)}\right)$$
$$z_i^{(t)} = \eta_i^{(t)} + \left(y_i - \mu_i^{(t)}\right)/v_i^{(t)}$$
$$w_i^{(t)} = n_i v_i^{(t)}$$
Here, $n_i$ represents the binomial denominator for the $i$th observation; for binary data, all of the $n_i$ are 1.
(c) Regress the working response on the predictors by weighted least squares, minimizing the weighted residual sum of squares
$$\sum_{i=1}^{n} w_i^{(t)} \left(z_i^{(t)} - \mathbf{x}_i'\beta\right)^2$$
where $\mathbf{x}_i'$ is the $i$th row of the model matrix.
(d) Repeat steps (b) and (c) until the regression coefficients stabilize at the maximum-likelihood estimator $\widehat{\beta}$.
(e) Calculate the estimated asymptotic covariance matrix of the coefficients as
$$\widehat{V}(\widehat{\beta}) = (\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}$$
where $\mathbf{W} = \mathrm{diag}\{w_i\}$ is the diagonal matrix of weights from the last iteration and $\mathbf{X}$ is the model matrix.
Problem: Program this method in R. The function that you define should take (at least) three arguments: the model matrix X; the response vector of observed proportions y; and the vector of binomial denominators n. I suggest that you let n default to a vector of 1s (i.e., for binary data, where y consists of 0s and 1s), and that you attach a column of 1s to the model matrix for the regression constant so that the user does not have to do this when the function is called. Programming hints:
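As a starting point, here is a minimal sketch of the IWLS scheme described above. The function name lreg, its default arguments, and the use of lsfit() for the weighted regressions are illustrative choices, not part of the original problem:

```r
# Minimal IWLS logistic regression, following steps (a)-(e) above.
# 'lreg' is a hypothetical name; defaults are illustrative.
lreg <- function(X, y, n = rep(1, length(y)), max.iter = 25, tol = 1e-10) {
  X <- cbind(1, X)                   # attach a column of 1s for the constant
  b <- rep(0, ncol(X))               # start values: beta^(0) = 0
  for (it in 1:max.iter) {
    eta <- as.vector(X %*% b)        # current linear predictor
    mu  <- 1/(1 + exp(-eta))         # fitted probabilities
    v   <- mu*(1 - mu)               # variance-function values
    z   <- eta + (y - mu)/v          # working response
    w   <- n*v                       # weights
    b.new <- lsfit(X, z, wt = w, intercept = FALSE)$coef
    if (max(abs(b.new - b)) < tol) { b <- b.new; break }
    b <- b.new
  }
  V <- solve(t(X) %*% diag(w) %*% X) # estimated asymptotic covariance matrix
  list(coefficients = b, var = V, iterations = it)
}
```

Since glm() itself fits binomial models by this algorithm, a convenient check is that lreg() reproduces the coefficients and standard errors of glm(y ~ X, family = binomial).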
$$Y_i = \begin{cases} 1 & \text{if } \xi_i \le \alpha_1 \\ 2 & \text{if } \alpha_1 < \xi_i \le \alpha_2 \\ \;\vdots \\ m-1 & \text{if } \alpha_{m-2} < \xi_i \le \alpha_{m-1} \\ m & \text{if } \alpha_{m-1} < \xi_i \end{cases}$$
The thresholds, regions, and corresponding values of ξ and Y are represented graphically in the following figure. Notice that the thresholds are not in general uniformly spaced.
and thus
$$\Pr(Y_i > j) = \Lambda\left(\alpha_j + \beta_1 X_{i1} + \cdots + \beta_k X_{ik}\right) = \Lambda\left(\alpha_j + \mathbf{x}_i'\beta\right), \quad j = 1, \ldots, m-1 \tag{3}$$
where Λ (·) is the cumulative logistic distribution. In this parametrization, the intercepts αj are the negatives of the category thresholds. The ordered probit model is similar with
$$\Pr(Y_i > j) = \Phi\left(\alpha_j + \beta_1 X_{i1} + \cdots + \beta_k X_{ik}\right) = \Phi\left(\alpha_j + \mathbf{x}_i'\beta\right), \quad j = 1, \ldots, m-1 \tag{5}$$
where Φ (·) is the cumulative normal distribution.
The log-likelihood under both the ordered logit and ordered probit model takes the following form:
$$\log_e L(\alpha, \beta) = \sum_{i=1}^{n} \log_e \left( \pi_{i1}^{w_{i1}} \pi_{i2}^{w_{i2}} \cdots \pi_{im}^{w_{im}} \right)$$
where $\alpha$ is an $(m-1) \times 1$ vector containing all of the regression constants and $\beta$ is the $k \times 1$ vector containing the other regression coefficients; $\pi_{ij} = \Pr(Y_i = j)$ (i.e., the probability under the model that individual $i$ is in response category $j$); and the $w_{ij}$ are indicator variables equal to 1 if individual $i$ is observed in category $j$ and 0 otherwise. Thus, for each individual, only one of the $w_{ij}$ is equal to 1 and only the corresponding $\pi_{ij}$ contributes to the likelihood. Note that for either the ordered logit model or the ordered probit model, the individual-category probabilities can be computed as differences between adjacent cumulative probabilities from Equation 3 or 5 (which are functions of the parameters):
$$\pi_{i1} = 1 - \Pr(Y_i > 1)$$
$$\pi_{i2} = \Pr(Y_i > 1) - \Pr(Y_i > 2)$$
$$\;\vdots$$
$$\pi_{im} = \Pr(Y_i > m - 1)$$
Problem: Program the ordered-logit model or the ordered-probit model (or both). The function that you define should take (at least) two arguments: the model matrix X and the response vector y, which should be a factor or ordered factor. I suggest that you attach a column of 1s to the model matrix for the regression constants so that the user does not have to do this when the function is called; the ordered logit and probit models always have constants. Your function can include an argument to indicate which model – logit or probit – is to be fit.
Programming hints:
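One way to proceed is to maximize the log-likelihood above numerically with optim(), computing the individual-category probabilities as differences of adjacent cumulative probabilities. The sketch below fits the ordered logit model; the name ologit, the BFGS method, and the choice of start values are illustrative assumptions:

```r
# Ordered logit by direct maximization of the log-likelihood.
# 'ologit' is a hypothetical name; parameters are (alpha_1..alpha_{m-1}, beta).
ologit <- function(X, y) {
  y <- as.numeric(y)                  # category numbers 1, ..., m
  m <- max(y); k <- ncol(X); n <- nrow(X)
  negLL <- function(par) {
    alpha <- par[1:(m - 1)]           # regression constants
    beta  <- par[-(1:(m - 1))]
    # cumulative probabilities Pr(Y > j), j = 1, ..., m - 1 (n x (m-1))
    cum <- sapply(alpha, function(a) plogis(a + as.vector(X %*% beta)))
    # individual-category probabilities pi_ij as adjacent differences;
    # pmax() guards against log(0) when alpha ordering is violated en route
    P <- pmax(cbind(1, cum) - cbind(cum, 0), 1e-12)
    -sum(log(P[cbind(1:n, y)]))       # only the observed category contributes
  }
  prop  <- cumsum(table(y))/n         # empirical cumulative proportions
  start <- c(qlogis(1 - prop[-m]), rep(0, k))
  fit <- optim(start, negLL, method = "BFGS", hessian = TRUE)
  list(coefficients = fit$par, var = solve(fit$hessian),
       deviance = 2*fit$value)
}
```

Start values for the $\alpha_j$ come from the empirical cumulative proportions, with $\beta$ starting at 0; the estimated covariance matrix is the inverse Hessian of the negative log-likelihood. Note that MASS::polr() fits the same model with the opposite sign convention for the intercepts.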
$$\Pr(Y_i > j) = \Lambda\left(\alpha_j + \beta_{1j} X_{i1} + \cdots + \beta_{kj} X_{ik}\right), \quad j = 1, \ldots, m-1$$
or
$$\Pr(Y_i > j) = \Phi\left(\alpha_j + \beta_{1j} X_{i1} + \cdots + \beta_{kj} X_{ik}\right), \quad j = 1, \ldots, m-1$$
Program one or the other (or both) of these models. For your example regression, use a likelihood-ratio test to compare the more general cumulative logit or probit model to the more restrictive ordered logit or probit model of the preceding problem. This test checks the assumption of equal slopes. The cumulative logit and probit models (along with the ordered logit and probit models) can be fit by the vglm() function in the VGAM package.
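If VGAM is not available, the more general cumulative-logit model (a separate slope vector for each of the $m-1$ cumulative logits) can also be fit by direct maximization with optim(), after which the equal-slopes test is an ordinary likelihood-ratio test on $k(m-2)$ degrees of freedom. The function name clogit below is a hypothetical illustration (unrelated to survival::clogit):

```r
# General cumulative logit: separate (alpha_j, beta_1j, ..., beta_kj) per logit.
# 'clogit' is a hypothetical name for this sketch.
clogit <- function(X, y) {
  y <- as.numeric(y); m <- max(y); k <- ncol(X); n <- nrow(X)
  negLL <- function(par) {
    B <- matrix(par, nrow = m - 1)    # row j holds (alpha_j, beta_1j, ..., beta_kj)
    cum <- sapply(1:(m - 1),
                  function(j) plogis(B[j, 1] + as.vector(X %*% B[j, -1])))
    P <- pmax(cbind(1, cum) - cbind(cum, 0), 1e-12)
    -sum(log(P[cbind(1:n, y)]))
  }
  prop  <- cumsum(table(y))/n
  start <- c(qlogis(1 - prop[-m]), rep(0, (m - 1)*k))
  fit <- optim(start, negLL, method = "BFGS")
  list(coefficients = matrix(fit$par, m - 1), deviance = 2*fit$value)
}
```

Given the ordered model's deviance dev0 (from the preceding problem) and the general model's deviance dev1, the test statistic is LR <- dev0 - dev1, referred to pchisq(LR, df = k*(m - 2), lower.tail = FALSE).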
(a) All of its nonzero rows (if any) precede all of its zero rows (if any).
(b) The first entry (from left to right) – called the leading entry – in each nonzero row is 1.
(c) The leading entry in each nonzero row after the first is to the right of the leading entry in the previous row.
(d) All other entries are 0 in a column containing a leading entry.
A matrix can be put into reduced row echelon form by a sequence of elementary row operations, which are of three types:
(a) Multiply each entry in a row by a nonzero constant.
(b) Interchange two rows.
(c) Add a multiple of one row to another row.
The matrix is now in reduced row-echelon form. The rank of a matrix is the number of nonzero rows in its reduced row-echelon form, and so the matrix in this example is of rank 2. Problem: Write an R function to calculate the reduced row-echelon form of a matrix by elimination. Programming hints:
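A compact sketch of such a function, using Gauss-Jordan elimination built from the three elementary row operations; the name rref and the partial-pivoting choice are illustrative:

```r
# Reduced row-echelon form by Gauss-Jordan elimination.
# 'rref' is a hypothetical name; 'tol' decides when an entry counts as zero.
rref <- function(A, tol = 1e-10) {
  A <- as.matrix(A)
  n <- nrow(A); m <- ncol(A)
  pivot <- 1
  for (col in 1:m) {
    if (pivot > n) break
    # partial pivoting: largest entry in this column at or below the pivot row
    i <- which.max(abs(A[pivot:n, col])) + pivot - 1
    if (abs(A[i, col]) < tol) next              # no usable pivot in this column
    A[c(pivot, i), ] <- A[c(i, pivot), ]        # (b) interchange two rows
    A[pivot, ] <- A[pivot, ]/A[pivot, col]      # (a) scale so leading entry is 1
    for (r in seq_len(n)[-pivot])               # (c) sweep the other rows
      A[r, ] <- A[r, ] - A[r, col]*A[pivot, ]
    pivot <- pivot + 1
  }
  A
}
```

As in the text, the rank of the input matrix can then be read off as the number of nonzero rows of the result.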