
LURKING VARIABLES IN ECONOMETRIC MODELS

Frank G. Landram, West Texas A & M University
Amjad Abdullat, West Texas A & M University

ABSTRACT

Lurking variables are omitted variables that should be included in the regression model. If the lurking variable is part of a synergistic combination, the effects it has on a regression model are magnified. This paper illustrates the seriousness of omitting a variable that is part of a synergistic combination. When this happens, synergistic variables in the regression model act as if they are unrelated with the dependent variable. This drastically reduces the model's effectiveness and can lead to misleading results. An awareness of synergistic variables and their possible effect on underspecified regression models enables analysts to become more proficient in econometric modeling.

INTRODUCTION

Little research is available concerning the possible effects of excluding relevant variables which are part of a synergistic combination [4]. Although these effects are known in many circles by oral tradition and intuition, they have not been documented. Moreover, the results of excluding these types of variables are usually misleading.

Using obvious notation, this paper illustrates the seriousness of lurking variables in regression [10]. If the lurking variable is synergistic, the seriousness is magnified. Synergism in regression enables multiple R² to become greater than the sum of the simple r² coefficients; R² > Σ r²_y.j. Regressors which are seemingly unrelated with Y become significant when combined with other variable(s). Using empirical data, an example is given which illustrates the misleading results when synergistic variables are omitted from the regression model. Concluding remarks are given which will enhance one's ability in model building.

Benefits

This paper benefits readers by showing (a) the possible misleading results of lurking variables and underspecified models; this encourages analysts to become more diligent in their preliminary research procedures. (b) Readers are given a strong awareness that they should not always accept statistical tests as being definitive. If intuition and a knowledge of the subject matter indicate that an insignificant regressor is in fact related with Y, search for additional influential variables; the regressor in question may be a synergistic variable needing another regressor to complete the synergistic combination. (c) Variable selection algorithms cannot replace a knowledge of the subject matter and must be used with caution. When the regression model is underspecified, variable selection algorithms usually magnify the problem. (d) A knowledge of synergism also enhances one's skills in regression analysis. Multicollinearity is desirable for synergistic variables but becomes a problem for other variables [3]. This concept is explained below.

SYNERGISTIC VARIABLES

A synergistic variable is defined as a variable whose partial r² value is greater than its simple r² value [4]. A variable is also considered synergistic if it possesses a significant partial F value and an insignificant simple r² value. Synergistic variables must be used in combination with other variables. Several articles have illustrated synergism in regression [4][9]. Kendall and Stuart [5] show that, given r_y.j r_y.k > 0, if variables Xj and Xk are inversely related (r_j.k < 0) or if

    r_j.k > 2 r_y.j r_y.k / (r²_y.j + r²_y.k),        (1)

then the variables are synergistic; R² > r²_y.j + r²_y.k. They also give the conditions for identifying synergism when r_y.j r_y.k < 0.
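Condition (1) can be checked numerically. The sketch below is a hypothetical illustration (the correlation values are invented, not taken from the paper); it uses the standard two-regressor identity R² = (r²_y.1 + r²_y.2 − 2 r_y.1 r_y.2 r_1.2)/(1 − r²_1.2):

```python
# Hedged numeric check of synergism condition (1) for two regressors.
# multiple_r2 implements the standard identity for R^2 in terms of the
# three pairwise correlations (hypothetical values, not the paper's data).
def multiple_r2(ry1, ry2, r12):
    return (ry1**2 + ry2**2 - 2.0 * ry1 * ry2 * r12) / (1.0 - r12**2)

ry1, ry2 = 0.6, 0.2                              # simple correlations with Y
bound = 2.0 * ry1 * ry2 / (ry1**2 + ry2**2)      # right-hand side of (1): 0.6

# Collinearity above the bound: synergism, R^2 exceeds the sum of simple r^2.
assert multiple_r2(ry1, ry2, 0.8) > ry1**2 + ry2**2
# Inversely related regressors (r12 < 0): also synergistic.
assert multiple_r2(ry1, ry2, -0.5) > ry1**2 + ry2**2
# Mild collinearity below the bound: no synergism.
assert multiple_r2(ry1, ry2, 0.3) < ry1**2 + ry2**2
```

All three correlation matrices checked here are positive definite, so each scenario is attainable with real data.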

Daniel and Wood [2] and Freund [3] show that synergism in regression is a function of multicollinearity. As multicollinearity increases among the regressors, the importance of the regressors may also increase; this increases the partial F values and decreases the residual mean square. This concept is illustrated in Figure 1 for the regression model

    Ŷ = b0 + b1X1 + b2X2.        (2)

Multiple R² is measured on the vertical axis and the correlation or multicollinearity between X1 and X2 (r_1.2) is measured on the horizontal axis. Given specific values for r_y.1 and r_y.2, and starting at the extreme negative point for r_1.2, multiple R² decreases as r_1.2 increases. This continues throughout the permissible range for r_1.2 until R² reaches its minimum, after which it increases. Hence, multicollinearity is desirable if X1 and X2 are inversely related (r_1.2 < 0) or if r_1.2 is greater than the bound in (1), given r_y.1 r_y.2 > 0. These values are where X1 and X2 become synergistic.
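The Figure 1 relationship can be traced numerically. The sketch below assumes hypothetical correlations r_y.1 = 0.5 and r_y.2 = 0.3 (invented values, not from the paper) and the same two-regressor R² identity, and confirms that R² falls to an interior minimum as r_1.2 rises from its extreme negative value, then increases, exceeding r²_y.1 + r²_y.2 near both ends of the permissible range:

```python
# Tracing the Figure 1 relationship with hypothetical correlations
# r_y.1 = 0.5 and r_y.2 = 0.3 (invented values, not from the paper).
import math

ry1, ry2 = 0.5, 0.3

def multiple_r2(r12):
    # Standard two-regressor identity for R^2.
    return (ry1**2 + ry2**2 - 2.0 * ry1 * ry2 * r12) / (1.0 - r12**2)

# Permissible range of r_1.2 (correlation matrix stays positive definite):
# r_y.1 r_y.2 +/- sqrt((1 - r_y.1^2)(1 - r_y.2^2)).
half = math.sqrt((1.0 - ry1**2) * (1.0 - ry2**2))
lo, hi = ry1 * ry2 - half, ry1 * ry2 + half

grid = [lo + (hi - lo) * k / 400.0 for k in range(1, 400)]  # open interval
curve = [multiple_r2(r) for r in grid]
k_min = curve.index(min(curve))

# R^2 falls from the extreme negative end to an interior minimum, then rises;
# near both ends it exceeds r_y.1^2 + r_y.2^2 = 0.34 (synergism).
assert curve[0] > curve[k_min] < curve[-1]
assert curve[0] > 0.34 and curve[-1] > 0.34
```

For these values the minimum sits at r_1.2 = 0.6, which is exactly the bound given by condition (1).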

FINANCIAL ANALYSIS EXAMPLE

Stock prices (Y) are a function of annual return on investment and anticipated growth. The data in Table 1 were obtained from a sample of 35 companies in Dun's Review. The types of variables employed in describing stock prices are listed below.

    X1: Yield = (Dividend + Price Change)/Current Price
    X2: Dividends
    X3: Earnings per share
    X4: Sales
    X5: Income
    X6: Return on sales
    X7: Return on equity (ROE)
    X8: Exchange traded


ANALYSIS

Table 2 gives the computer output for two scenarios. Assume the analyst excludes variable X1 and computes the initial least squares regression model with the following variables:

    Y = f(X2 X3 X4 X5 X6 X7 X8)        R² = 0.53        (3)

Using stepwise regression with the significance level at 0.10, the above model reduces to

    Y = f(X3 X4)        R² = 0.46        (4)

By excluding X1, both of the above models are underspecified.

Table 2  Partial F Measures in Regression Models

(a) Y = f(X1 X2 X3 X4 X5 X6 X7 X8)    R² = 0.79

    Variable    X1      X2      X3      X4      X5      X6      X7      X8
    partial F   6.15    30.86   25.50   2.45    1.67    4.67    0.24    1.
    p-value     0.020   0.001   0.001   0.130   0.208   0.040   0.626   0.

(b) Y = f(X1 X2 X3 X4 X6)    R² = 0.76

    Variable    X1      X2      X3      X4      X6
    partial F   7.05    34.86   24.01   2.47    3.
    p-value     0.013   0.001   0.001   0.127   0.

(c) Y = f(X1 X2 X3)    R² = 0.73

    Variable    X1      X2      X3
    partial F   12.39   36.5    23.
    p-value     0.0014  0.0001  0.

(d) Y = f(X2 X3 X4 X5 X6 X7 X8)    R² = 0.53

    Variable    X2      X3      X4      X5      X6      X7      X8
    partial F   0.26    17.49   2.88    1.81    1.20    1.33    1.
    p-value     0.617   0.001   0.101   0.190   0.283   0.259   0.



When X1 is included, the stepwise regression algorithm reduces the initial model

    Y = f(X1 X2 X3 X4 X5 X6 X7 X8)    R² = 0.79        (5)

to

    Y = f(X1 X2 X3)    R² = 0.73        (6)

The adjusted R² values for (5) and (6) are 0.722 and 0.70. Table 3 reveals that variables X1, X3, X4, and X5 possess significant simple r-values. However, a variable is deleted from a regression model because it is either unrelated with Y or it is multicollinear with other X-variables (partial duplication of data) and not needed. The latter reason is why variables X4 and X5 are deleted from (5). Although X4 and X5 possess significant simple r-values, their multicollinearity causes them to become insignificant and be deleted from the model.
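A small simulation can illustrate this deletion mechanism. The data below are hypothetical stand-ins for X4 and X5, not the Dun's Review sample; the noise in Y is constructed orthogonal to X5's unique component so that X5 exactly duplicates X4's information, making its partial F essentially zero even though its simple correlation with Y is sizeable:

```python
# A hypothetical simulation (not the Dun's Review data) of the deletion
# mechanism above: x5 nearly duplicates x4 (r roughly 0.97, like the
# paper's 0.967), and y's noise is built orthogonal to x5's unique
# component, so x5's partial F given x4 is ~0.
import numpy as np

rng = np.random.default_rng(0)
n = 35                                     # same sample size as the example
x4 = rng.normal(size=n)
x5 = x4 + 0.25 * rng.normal(size=n)        # highly collinear stand-in for X5

A = np.column_stack([np.ones(n), x4])      # intercept + x4 design
z = x5 - A @ np.linalg.lstsq(A, x5, rcond=None)[0]   # x5's unique component
e = rng.normal(size=n)
e -= (e @ z) / (z @ z) * z                 # noise orthogonal to z (by design)
y = 1.5 * x4 + e                           # Y driven by x4 alone

def fit_sse(cols):
    """Residual sum of squares from least squares on cols plus an intercept."""
    X = np.column_stack([np.ones(n)] + cols)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    return float(r @ r)

sse_full = fit_sse([x4, x5])               # model with both regressors
sse_red = fit_sse([x4])                    # model with x4 only
partial_F = (sse_red - sse_full) / (sse_full / (n - 3))

assert abs(np.corrcoef(x5, y)[0, 1]) > 0.5   # x5 alone looks related to Y
assert partial_F < 0.01                      # ...but adds nothing given x4
```

The point of the construction is that a significant simple r-value and a significant partial F-value answer different questions: the first ignores the other regressors, the second conditions on them.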

Table 3  Correlation Coefficient Matrix (Bottom Diagonal)

    Variable    Y        X1       X2       X3       X4       X5       X6       X7
    Y           1.000
    X1         -0.442    1.000
    X2          0.191    0.597    1.000
    X3          0.633    0.033    0.348    1.000
    X4          0.425    0.033    0.443    0.320    1.000
    X5          0.383    0.132    0.512    0.309    0.967    1.000
    X6         -0.096    0.256    0.048   -0.107   -0.270   -0.089    1.000
    X7          0.073   -0.260   -0.289   -0.105   -0.118   -0.082    0.289    1.000
    X8          0.034    0.184    0.311   -0.083    0.204    0.221   -0.192   -0.

The Least Significant absolute Value for simple r equals 0.254 (LSV-r = 0.254). All r-values below 0.254 in absolute value are insignificant.

By employing the stepwise regression algorithm, (6) is obtained from (5). By employing the all possible regressions algorithm on (5), the following model is obtained:

    Y = f(X1 X2 X3 X4 X6)    R² = 0.76        (7)

The adjusted R² value for (7) is 0.721. The partial F-value for X4 is marginal at 2.47; p-value = 0.13. However, X6 is a synergistic variable and only becomes significant when X2 and X4 are included in the model. Thus, the identification of synergistic combinations and a knowledge of the subject matter are used in selecting regression model (7) over model (6).
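The all possible regressions idea can be sketched in a few lines: enumerate every subset of candidate regressors and rank the fitted models by adjusted R². The data, seed, and variable count below are hypothetical, not the paper's stock-price sample:

```python
# A minimal sketch of "all possible regressions": enumerate every subset of
# candidate regressors and rank fitted models by adjusted R^2. The data and
# variable count are hypothetical, not the paper's stock-price sample.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
n = 35
X = rng.normal(size=(n, 4))                              # four candidates
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)   # only two matter

def adj_r2(cols):
    """Adjusted R^2 of the least squares fit on the chosen columns."""
    Z = np.column_stack([np.ones(n), X[:, list(cols)]])
    beta = np.linalg.lstsq(Z, y, rcond=None)[0]
    sse = float(((y - Z @ beta) ** 2).sum())
    sst = float(((y - y.mean()) ** 2).sum())
    p = len(cols)                                        # regressors used
    return 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))

subsets = [c for k in range(1, 5) for c in combinations(range(4), k)]
best = max(subsets, key=adj_r2)
print(best)   # should contain the two true regressors, columns 0 and 1
```

Even here the best subset by adjusted R² can admit a pure-noise regressor, which echoes the paper's caution that selection algorithms cannot replace knowledge of the subject matter.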


Sometimes autocorrelation and violations of assumptions such as normality and homoskedasticity may be correctly viewed as underspecification error. Kramer [7] and Maddala [8] have excellent expositions on misspecification errors. Belsley [1] argues for the use of prior information in specification analysis. Certainly, econometric models must be developed by people well grounded in economic theory and possessing a firm knowledge of the subject matter.

Finally, in the preliminary research phase, always obtain a further knowledge of the subject matter. Take time to identify influential variables. In the analysis phase of the research, if intuition strongly indicates that a seemingly unrelated variable should be related to Y, return to the preliminary research phase. Search for additional variables so that the proper synergistic combination is included in the regression model.

REFERENCES

[1] Belsley, D.A. (1986), "Model Selection in Regression Analysis, Regression Diagnostics and Prior Knowledge," International Journal of Forecasting, 2, 41-46, and commentary 46-52.
[2] Daniel, C., and Wood, F.S. (2000), Fitting Equations to Data, 3rd ed., New York: John Wiley.
[3] Freund, R.J. (1988), "When is R² > r²_y.1 + r²_y.2 (Revisited)," The American Statistician, 42, 89-90.
[4] Hamilton, David (1987), "Sometimes R² > r²_y.1 + r²_y.2: Correlated Variables Are Not Always Redundant," The American Statistician, 41, 129-132.
[5] Kendall, M.G., and Stuart, A. (1973), The Advanced Theory of Statistics, Vol. 2, 3rd ed., New York: Hafner Publishing.
[6] Kennedy, Peter (1998), A Guide to Econometrics, 4th ed., Cambridge, Mass.: The MIT Press.
[7] Kramer, W. (1985), "Diagnostic Checking in Practice," Review of Economics and Statistics, 67, 118-123.
[8] Maddala, G.S. (1995), "Specification Tests in Limited Dependent Variable Models," Advances in Econometrics and Quantitative Economics, Oxford: Blackwell, 1-49.
[9] Mitra, S. (1988), "The Relationship Between the Multiple and the Zero-Order Correlation Coefficients," The American Statistician, 42, 89.
[10] Ryan, T.P. (1997), Modern Regression Methods, New York: Wiley.

Southwestern Economic Review