




This paper discusses lurking variables in econometric models: omitted variables that should be included to improve model accuracy. The authors explain how synergistic variables, which have a greater impact when combined with other variables, can lead to misleading results if not accounted for. The paper provides examples that illustrate synergistic variables and their role in econometric modeling.
regression analysis. Multicollinearity is desirable for synergistic variables but
becomes a problem for other variables [3]. This concept is explained below.
SYNERGISTIC VARIABLES
A synergistic variable is defined as a variable whose partial r^2 value is greater than its simple r^2 value [4]. A variable is also considered synergistic if it possesses a significant partial F value and an insignificant simple r^2 value.
Synergistic variables must be used in combination with other variables. Several articles have illustrated synergism in regression [4], [9]. Kendall and Stuart [5] show that, given r_{y.j} r_{y.k} > 0, if the variables Xj and Xk are inversely related (r_{j.k} < 0) or if

    r_{j.k} > 2 r_{y.j} r_{y.k} / (r_{y.j}^2 + r_{y.k}^2),        (1)

then the variables are synergistic; that is, R^2 > r_{y.j}^2 + r_{y.k}^2. They also give the conditions for identifying synergism when r_{y.j} r_{y.k} < 0.
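As a quick numeric check of condition (1), the sketch below uses the standard two-regressor identity R^2 = (r_{y.1}^2 + r_{y.2}^2 - 2 r_{y.1} r_{y.2} r_{1.2}) / (1 - r_{1.2}^2); the correlation values in the example are illustrative assumptions, not figures from the paper.

```python
# Quick check of condition (1) for two standardized regressors X1 and X2.
def r_squared(r_y1, r_y2, r_12):
    """Multiple R^2 of Y on X1 and X2, expressed through the three correlations."""
    return (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12**2)

def is_synergistic(r_y1, r_y2, r_12):
    """Condition (1): given r_y1*r_y2 > 0, synergy holds when r_12 < 0 or
    r_12 > 2*r_y1*r_y2 / (r_y1**2 + r_y2**2)."""
    assert r_y1 * r_y2 > 0, "this form of the condition assumes r_y1*r_y2 > 0"
    return r_12 < 0 or r_12 > 2.0 * r_y1 * r_y2 / (r_y1**2 + r_y2**2)

# Illustrative (assumed) correlations: one strong and one weak regressor that
# are themselves fairly correlated with each other.
r_y1, r_y2, r_12 = 0.6, 0.1, 0.5
print(is_synergistic(r_y1, r_y2, r_12))                 # True
print(r_squared(r_y1, r_y2, r_12) > r_y1**2 + r_y2**2)  # True: R^2 exceeds the sum
```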
Daniel and Wood [2] and Freund [3] show that synergism in regression is a function of multicollinearity. As multicollinearity increases among the regressors, the importance of the regressors may also increase; this increases the partial F values and decreases the residual mean square. This concept is illustrated in Figure 1 for the regression model

    Ŷ = b0 + b1 X1 + b2 X2.        (2)

Multiple R^2 is measured on the vertical axis, and the correlation, or multicollinearity, between X1 and X2 (r_{1.2}) is measured on the horizontal axis. Given specific values for r_{y.1} and r_{y.2}, and starting at the extreme negative point for r_{1.2}, multiple R^2 decreases as r_{1.2} increases. This continues throughout the permissible range for r_{1.2} until R^2 reaches its minimum, after which it increases. Hence, multicollinearity is desirable if X1 and X2 are inversely related (r_{1.2} < 0) or if r_{1.2} exceeds the bound in (1), given r_{y.1} r_{y.2} > 0. These are the values at which X1 and X2 become synergistic.
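The behaviour just described can be reproduced numerically. The sketch below sweeps r_{1.2} over the range that keeps the three-variable correlation matrix positive definite, using assumed values for r_{y.1} and r_{y.2}; Figure 1 itself is not reproduced here.

```python
import numpy as np

# Assumed example values for the two simple correlations with Y.
r_y1, r_y2 = 0.6, 0.1

# Range of r_12 that keeps the 3x3 correlation matrix positive definite.
half_width = np.sqrt((1.0 - r_y1**2) * (1.0 - r_y2**2))
lo, hi = r_y1 * r_y2 - half_width, r_y1 * r_y2 + half_width

r12 = np.linspace(lo + 1e-3, hi - 1e-3, 201)
r2 = (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r12) / (1.0 - r12**2)

print("R^2 at the most negative permissible r_12:", round(float(r2[0]), 3))
print("minimum R^2:", round(float(r2.min()), 3),
      "at r_12 =", round(float(r12[r2.argmin()]), 3))
print("synergy threshold from (1):",
      round(2.0 * r_y1 * r_y2 / (r_y1**2 + r_y2**2), 3))
# R^2 falls from the left end of the range, reaches its minimum, then rises,
# exceeding r_y1^2 + r_y2^2 wherever r_12 < 0 or r_12 lies above the threshold.
```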
FINANCIAL ANALYSIS EXAMPLE
Stock prices (Y) are a function of annual return on investment and anticipated growth. The data in Table 1 were obtained from a sample of 35 companies in Dun's Review. The variables employed in describing stock prices are listed below.

X1: Yield = (Dividend + Price Change)/Current Price
X2: Dividends
X3: Earnings per share
X4: Sales
X5: Income
X6: Return on sales
X7: Return on equity (ROE)
X8: Exchange traded
ANALYSIS
Table 2 gives the computer output for two scenarios. Assume the analyst excludes variable X1 and computes the initial least squares regression model with the following variables:

    Y = f(X2 X3 X4 X5 X6 X7 X8),    R^2 = 0.53.        (3)

Using stepwise regression with the significance level set at 0.10, the above model reduces to

    Y = f(X3 X4),    R^2 = 0.46.        (4)

By excluding X1, both of the above models are underspecified.
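A hedged sketch of the stepwise step described above is given below, implemented as backward elimination on p-values with statsmodels; the DataFrame `data` and the column names Y, X1, ..., X8 are assumptions standing in for Table 1, which is not reproduced here.

```python
import statsmodels.api as sm

def backward_stepwise(y, X, alpha=0.10):
    """Backward elimination: drop the regressor with the largest p-value until
    every remaining p-value is at or below alpha. The partial F for a single
    regressor equals the square of its t statistic."""
    cols = list(X.columns)
    while cols:
        fit = sm.OLS(y, sm.add_constant(X[cols])).fit()
        pvals = fit.pvalues.drop("const")
        worst = pvals.idxmax()
        if pvals[worst] <= alpha:
            return fit, cols
        cols.remove(worst)
    return None, []

# Usage, assuming `data` is a DataFrame holding Table 1 with columns Y, X1..X8:
# fit, kept = backward_stepwise(data["Y"], data[[f"X{i}" for i in range(1, 9)]])
# print(kept, round(fit.rsquared, 2))
# print((fit.tvalues ** 2).drop("const"))   # partial F values, as in Table 2
```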
Table 2  Partial F Measures in Regression Models

(a) Full model (5): Y = f(X1 X2 X3 X4 X5 X6 X7 X8), R^2 = 0.79
Variable    X1      X2      X3      X4      X5      X6      X7      X8
partial F   6.15    30.86   25.50   2.45    1.67    4.67    0.24    1.
p-value     0.020   0.001   0.001   0.130   0.208   0.040   0.626   0.

(b) Five-variable model (7), R^2 = 0.76
partial F   7.05    34.86   24.01   2.47    3.
p-value     0.013   0.001   0.001   0.127   0.

(c) Model (6): Y = f(X1 X2 X3), R^2 = 0.73
Variable    X1      X2      X3
partial F   12.39   36.5    23.
p-value     0.0014  0.0001  0.

(d) Model (3), X1 excluded: Y = f(X2 X3 X4 X5 X6 X7 X8), R^2 = 0.53
Variable    X2      X3      X4      X5      X6      X7      X8
partial F   0.26    17.49   2.88    1.81    1.20    1.33    1.
p-value     0.617   0.001   0.101   0.190   0.283   0.259   1.
When X1 is included, the stepwise regression algorithm reduces the initial model

    Y = f(X1 X2 X3 X4 X5 X6 X7 X8),    R^2 = 0.79        (5)

to

    Y = f(X1 X2 X3),    R^2 = 0.73.        (6)

The adjusted R^2 values for (5) and (6) are 0.722 and 0.70. Table 3 reveals that four variables (X1, X3, X4, X5) possess significant simple r values. However, a variable is deleted from a regression model either because it is unrelated to Y or because it is multicollinear with other X-variables (partial duplication of data) and therefore not needed. The latter reason is why X4 and X5 are deleted from (5): although both possess significant simple r values, their multicollinearity causes them to become insignificant, and they are removed from the model.
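The effect described above, in which two individually significant but nearly duplicate regressors wash each other out, can be reproduced with simulated data. The snippet below is an illustration only and does not use the Dun's Review sample; the variable names x4 and x5 are simply suggestive labels.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 35                                    # same sample size as the paper's example
x4 = rng.normal(size=n)
x5 = x4 + 0.25 * rng.normal(size=n)       # r(x4, x5) is roughly 0.97
y = 0.5 * x4 + 0.5 * x5 + rng.normal(size=n)

both = sm.OLS(y, sm.add_constant(np.column_stack([x4, x5]))).fit()
alone = sm.OLS(y, sm.add_constant(x4)).fit()

print("partial F (t^2) with both regressors in:", np.round(both.tvalues[1:] ** 2, 2))
print("partial F with x4 alone:", round(float(alone.tvalues[1] ** 2), 2))
# The partial F of x4 drops sharply once its near-duplicate x5 enters the
# model, even though both variables are strongly correlated with y.
```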
Table 3  Correlation Coefficient Matrix (Bottom Diagonal)

Variable      Y       X1      X2      X3      X4      X5      X6      X7
Y           1.000
X1         -0.442   1.000
X2          0.191   0.597   1.000
X3          0.633   0.033   0.348   1.000
X4          0.425   0.033   0.443   0.320   1.000
X5          0.383   0.132   0.512   0.309   0.967   1.000
X6         -0.096   0.256   0.048  -0.107  -0.270  -0.089   1.000
X7          0.073  -0.260  -0.289  -0.105  -0.118  -0.082   0.289   1.000
X8          0.034   0.184   0.311  -0.083   0.204   0.221  -0.192  -0.
The least significant absolute value for a simple r (LSV-r) equals 0.254; all r values below 0.254 in absolute value are insignificant.
By employing the stepwise regression algorithm, (6) is obtained from (5). By employing the all-possible-regressions algorithm on (5), a five-variable model, (7), is obtained with R^2 = 0.76; its partial F values appear in Table 2(b). The adjusted R^2 value for (7) is 0.721. The partial F value for one of its regressors is marginal at 2.47 (p-value = 0.13). However, this regressor is a synergistic variable and only becomes significant when X2 and X4 are included in the model. Thus, the identification of synergistic combinations, together with knowledge of the subject matter, is used in selecting regression model (7) over model (6).
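A minimal sketch of the all-possible-regressions search is shown below, ranking every subset of the candidate regressors by adjusted R^2; again the DataFrame `data` and its column names are assumptions in place of Table 1.

```python
from itertools import combinations
import statsmodels.api as sm

def all_possible_regressions(y, X):
    """Fit every non-empty subset of the columns of X and return
    (adjusted R^2, subset) pairs sorted best-first."""
    results = []
    for k in range(1, len(X.columns) + 1):
        for subset in combinations(X.columns, k):
            fit = sm.OLS(y, sm.add_constant(X[list(subset)])).fit()
            results.append((fit.rsquared_adj, subset))
    return sorted(results, reverse=True)

# Usage, again assuming `data` stands in for Table 1:
# ranked = all_possible_regressions(data["Y"], data[[f"X{i}" for i in range(1, 9)]])
# for adj_r2, subset in ranked[:5]:
#     print(round(adj_r2, 3), subset)   # leading candidates, to be screened with subject-matter knowledge
```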
Sometimes autocorrelation, and violations of assumptions such as normality and homoskedasticity, may be correctly viewed as underspecification error. Kramer [7] and Maddala [8] have excellent expositions on misspecification errors. Belsley [1] argues for the use of prior information in specification analysis. Certainly, econometric models must be developed by people who are well grounded in economic theory and have a firm knowledge of the subject matter.
Finally, in the preliminary research phase, always obtain further knowledge of the subject matter. Take time to identify influential variables. In the analysis
phase of the research, if intuition strongly indicates that a seemingly unrelated
variable should be related to Y, return to the preliminary research phase. Search for
additional variables so that the proper synergistic combination is included in the
regression model.
REFERENCES
[1] Belsley, D.A. (1986), "Model Selection in Regression Analysis, Regression Diagnostics and Prior Knowledge," International Journal of Forecasting, 2, 41-46, and commentary 46-52.
[2] Daniel, C. and Wood, F.S. (2000), Fitting Equations to Data, 3rd ed., New York: John Wiley.
[3] Freund, R.J. (1988), "When is R^2 > r_{y.1}^2 + r_{y.2}^2? (Revisited)," The American Statistician, 42, 89-90.
[4] Hamilton, David (1987), "Sometimes R^2 > r_{y.1}^2 + r_{y.2}^2: Correlated Variables Are Not Always Redundant," The American Statistician, 41, 129-132.
[5] Kendall, M.G. and Stuart, A. (1973), The Advanced Theory of Statistics, Vol. 2, 3rd ed., New York: Hafner Publishing.
[6] Kennedy, Peter (1998), A Guide to Econometrics, 4th ed., Cambridge, Mass.: The MIT Press.
[7] Kramer, W. (1985), "Diagnostic Checking in Practice," Review of Economics and Statistics, 67, 118-123.
[8] Maddala, G.S. (1995), "Specification Tests in Limited Dependent Variable Models," Advances in Econometrics and Quantitative Economics, Oxford: Blackwell, 1-49.
[9] Mitra, S. (1988), "The relationship between the multiple and the zero-order correlation coefficients," The American Statistician, 42, 89.
[10] Ryan, T.P. (1997), Modern Regression Methods, New York: Wiley.