If the individual effect (a cross-sectional or time-specific effect) does
not exist, \(\left( {{u}_{i}}=0 \right)\), OLS produces efficient and
consistent parameter estimates for the model
\({{y}_{it}}={{\beta }_{0}}+{{\beta }_{1}}{{x}_{it}}+{{u}_{i}}+{{v}_{it}}\) (1)
in which we assume \(\left( {{u}_{i}}=0 \right)\).
OLS rests on five core assumptions (Greene, 2008; Kennedy, 2008):
o Linearity – the dependent variable is a linear function of the
independent variables and the disturbance.
o Exogeneity – the expected value of the disturbance is zero, or the
disturbances are not correlated with any regressor.
o Homoscedasticity and no autocorrelation – the disturbances have
constant variance and are not correlated with one another.
o Non-stochastic regressors – the independent variables are not
stochastic but fixed in repeated samples.
o Full rank – there is no exact linear relationship among the
independent variables.
There are several strategies for estimating a fixed effect model: the
least squares dummy variable (LSDV) model, within estimation, and
between estimation.
Random Effects (RE) Model
In the FE model discussed earlier, the goal of estimation is to
eliminate \({{u}_{i}}\) because it is thought to be correlated with
one or more of the \({{x}_{it}}\).
But suppose we assume that \({{u}_{i}}\) is uncorrelated with each
explanatory variable in all time periods. Then using a transformation
to eliminate \({{u}_{i}}\) results in inefficient estimators.
Eq (1) becomes an RE model when we assume that the unobserved effect
\({{u}_{i}}\) is uncorrelated with each explanatory variable in every time period:
\(Cov\left( {{x}_{it}},{{u}_{i}} \right)=0,\ t=1,\ldots ,T\) (2)
The ideal RE assumptions include all the FE assumptions plus the
additional requirement that \({{u}_{i}}\) is independent of all
explanatory variables in all time periods.
If we instead assume that the unobserved effect \({{u}_{i}}\) is
correlated with any of the explanatory variables, we should use first
differencing or FE.
To estimate the RE model, we define the composite error term as
\({{w}_{it}}={{u}_{i}}+{{v}_{it}}\); then Eq (1) can be written as
\({{y}_{it}}={{\beta }_{0}}+{{\beta }_{1}}{{x}_{it}}+{{w}_{it}}\) (3)
Because \({{u}_{i}}\) is in the composite error in each time period,
the \({{w}_{it}}\) are serially correlated across time.
Under the RE
assumptions;
\(Corr\left(
{{w}_{it}},{{w}_{is}} \right)=\sigma _{u}^{2}/\left( \sigma _{u}^{2}+\sigma
_{v}^{2} \right),t\ne s\) (4)
where \(\sigma _{u}^{2}=Var\left(
{{u}_{i}} \right)\) and \(\sigma _{v}^{2}=Var\left( {{v}_{it}}
\right)\)
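The correlation in Eq (4) follows in a few steps from the definition of the composite error. The sketch below assumes, as is standard under the RE assumptions, that \({{u}_{i}}\) and \({{v}_{it}}\) are uncorrelated and that the \({{v}_{it}}\) are serially uncorrelated with constant variance; these conditions are stated here for the derivation and are not spelled out in the text above.

```latex
\begin{aligned}
Cov(w_{it}, w_{is}) &= Cov(u_i + v_{it},\; u_i + v_{is}) \\
 &= Var(u_i) + Cov(u_i, v_{is}) + Cov(v_{it}, u_i) + Cov(v_{it}, v_{is}) \\
 &= \sigma_u^2, \qquad t \ne s \\
Var(w_{it}) &= Var(u_i) + Var(v_{it}) = \sigma_u^2 + \sigma_v^2 \\
Corr(w_{it}, w_{is}) &= \frac{Cov(w_{it}, w_{is})}{\sqrt{Var(w_{it})\,Var(w_{is})}}
 = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_v^2}
\end{aligned}
```

The three covariance terms that vanish in the second line are exactly the ones ruled out by the assumptions named above, so only the common component \({{u}_{i}}\) ties together two errors of the same unit.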
The RE model is estimated by generalized least squares (GLS) when the
covariance structure is known, and by feasible (FGLS) or estimated
(EGLS) generalized least squares when the covariance structure of the
composite error is unknown.
Compared to the FE model, an RE model is relatively difficult to
estimate. In FGLS, we first have to estimate \(\theta \) using
\(\hat{\sigma }_{u}^{2}\) and \(\hat{\sigma }_{v}^{2}\).
The \(\hat{\sigma }_{u}^{2}\) comes from the between effect estimation
(group mean regression), and \(\hat{\sigma }_{v}^{2}\) is derived from
the RSS of the within effect estimation, that is, from the deviations
of the residuals from their group means:
\(\hat{\theta
}=1-\sqrt{\frac{\hat{\sigma }_{v}^{2}}{T\hat{\sigma }_{u}^{2}+\hat{\sigma
}_{v}^{2}}}=1-\sqrt{\frac{\hat{\sigma }_{v}^{2}}{T\hat{\sigma }_{between}^{2}}}\)
(5)
where \(\hat{\sigma }_{u}^{2}=\hat{\sigma }_{between}^{2}-\frac{\hat{\sigma }_{v}^{2}}{T}\), \(\hat{\sigma }_{between}^{2}=\frac{RS{{S}_{between}}}{n-k-1}\), and
\(\hat{\sigma }_{v}^{2}=\frac{RS{{S}_{within}}}{nT-n-k}=\frac{e'{{e}_{within}}}{nT-n-k}=\frac{\sum\nolimits_{i=1}^{n}{\sum\nolimits_{t=1}^{T}{{{\left( {{v}_{it}}-{{{\bar{v}}}_{i}} \right)}^{2}}}}}{nT-n-k}\),
where \({{v}_{it}}\) are the residuals of the LSDV, n is the number of
groups, T is the number of time periods and k is the number of
regressors excluding the intercept.
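The variance components and Eq (5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the input values (RSS figures, n, T, k) are hypothetical and are not taken from the airline data, and a balanced panel is assumed.

```python
import math

def variance_components(rss_between, rss_within, n, T, k):
    """Return (sigma2_u, sigma2_v, sigma2_between) for a balanced panel.

    n = number of groups, T = time periods per group,
    k = number of regressors excluding the intercept.
    """
    sigma2_between = rss_between / (n - k - 1)
    sigma2_v = rss_within / (n * T - n - k)
    sigma2_u = sigma2_between - sigma2_v / T
    return sigma2_u, sigma2_v, sigma2_between

def theta_hat(sigma2_u, sigma2_v, T):
    """First form of Eq (5)."""
    return 1.0 - math.sqrt(sigma2_v / (T * sigma2_u + sigma2_v))

# Made-up inputs, chosen only so the arithmetic is easy to follow.
n, T, k = 6, 15, 3
s2u, s2v, s2b = variance_components(rss_between=0.04, rss_within=0.30,
                                    n=n, T=T, k=k)
theta = theta_hat(s2u, s2v, T)

# Second form of Eq (5): T * sigma2_u + sigma2_v equals T * sigma2_between.
theta_alt = 1.0 - math.sqrt(s2v / (T * s2b))

print(round(theta, 4), round(theta_alt, 4))  # 0.8889 0.8889
```

Both forms of Eq (5) give the same value because \(T\hat{\sigma }_{u}^{2}+\hat{\sigma }_{v}^{2}=T\hat{\sigma }_{between}^{2}\). Note that \(\hat{\sigma }_{u}^{2}\) is computed as a difference, so with real data it can turn out negative; the sketch does not guard against that case.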
Then, the dependent
variable, independent variables, and the intercept term need to be transformed
as follows;
\(y_{it}^{*}={{y}_{it}}-\hat{\theta
}{{\bar{y}}_{i}}\) (6)
\(x_{it}^{*}={{x}_{it}}-\hat{\theta
}{{\bar{x}}_{i}}\) (7)
\(\beta
_{0}^{*}=1-\hat{\theta }\) (8)
Finally, run OLS on the transformed variables from Eq (6), (7) and (8),
with the traditional intercept suppressed:
\(y_{it}^{*}=\beta
_{0}^{*}+{{\beta }_{1}}x_{it}^{*}+\varepsilon _{it}^{*}\) (9)
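The transformation in Eq (6) to Eq (8) and the final OLS step can be illustrated with a small pure-Python sketch. It reads Eq (8) as replacing the constant 1 by \(1-\hat{\theta }\) and entering that column as a regressor with no further intercept, which is one interpretation of "intercept suppressed". The toy panel, the value of theta and the function names are all hypothetical; theta is simply assumed here rather than estimated.

```python
def quasi_demean(panel, theta):
    """Apply Eq (6)-(8): subtract theta times the group mean from y and x,
    and replace the constant 1 by (1 - theta)."""
    y_star, x_star, c_star = [], [], []
    for obs in panel.values():  # obs: list of (x, y) pairs for one group
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            x_star.append(x - theta * x_bar)
            y_star.append(y - theta * y_bar)
            c_star.append(1.0 - theta)
    return y_star, x_star, c_star

def ols_no_intercept(y, c, x):
    """OLS of y on the two columns c and x, solved via the 2x2 normal equations."""
    scc = sum(ci * ci for ci in c)
    scx = sum(ci * xi for ci, xi in zip(c, x))
    sxx = sum(xi * xi for xi in x)
    scy = sum(ci * yi for ci, yi in zip(c, y))
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = scc * sxx - scx * scx
    b0 = (scy * sxx - scx * sxy) / det
    b1 = (scc * sxy - scx * scy) / det
    return b0, b1

# Hypothetical toy panel: two groups, three periods, y = 1 + 2*x + u_i,
# with u_A = +0.5, u_B = -0.5 and no idiosyncratic noise.
panel = {
    "A": [(1.0, 3.5), (2.0, 5.5), (3.0, 7.5)],
    "B": [(1.0, 2.5), (2.0, 4.5), (3.0, 6.5)],
}
theta = 0.5  # assumed value, not estimated from these data
y_star, x_star, c_star = quasi_demean(panel, theta)
b0, b1 = ols_no_intercept(y_star, c_star, x_star)
print(round(b0, 6), round(b1, 6))  # 1.0 2.0
```

Because the constant is transformed together with the data, the coefficient on the \(1-\hat{\theta }\) column is \({{\beta }_{0}}\) on the original scale. Setting theta to 0 reproduces pooled OLS, while theta equal to 1 would make the constant column zero and the normal equations singular, which corresponds to the within transformation.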
Estimation using Stata
For our discussion of RE estimation in Stata, we again use the data
airline.dta from the FE discussion. We want to estimate the effects of
output, fuel price and loading factor on the cost of airline companies:
\(cos{{t}_{it}}={{\beta
}_{0}}+{{\beta }_{1}}outpu{{t}_{it}}+{{\beta }_{2}}fue{{l}_{it}}+{{\beta
}_{3}}loa{{d}_{it}}+{{v}_{it}}\) (10)
where;
\(cos{{t}_{it}}\) =
cost of airline companies
\(outpu{{t}_{it}}\) =
revenue passenger mile (output index)
\(fue{{l}_{it}}\) = fuel prices
\(loa{{d}_{it}}\) = loading factor (average capacity utilization of
the fleet)
Now, let us estimate Eq (10) by pooled OLS:
reg cost output fuel load
Now, let us estimate the RE model. Doing so by hand requires that we
first estimate Eq (3) and then compute the value of \(\hat{\theta }\)
manually as in Eq (5). After that, we need to transform the data, also
manually, using the value of \(\hat{\theta }\) as in Eq (6), Eq (7) and
Eq (8), and then estimate the RE model by OLS as in Eq (9).
In Stata, we can skip the manual calculation and estimation in Eq (5)
through Eq (9). The command xtreg with the re option carries out these
steps automatically and reports the RE estimates.
Before we run the xtreg command, we first need to specify the
cross-sectional and time series variables:
xtset airline year
To estimate the RE
model as in Eq(9);
xtreg cost output fuel load,re theta
The sigma_u and sigma_e are the square roots of the variance
components for groups and errors, respectively
\(\left( 0.0156={{0.1249}^{2}},0.0036={{0.0601}^{2}} \right)\).
Note that the value 0.0602 displayed under sigma_e is a root mean squared error (a standard deviation), not the RSS itself.
The rho represents the ratio of the individual-specific error variance
to the composite (entire) error variance,
\(0.8119={{0.1249}^{2}}/\left( {{0.1249}^{2}}+{{0.0601}^{2}} \right)\).
A large ratio means that the individual-specific error accounts for a
large proportion of the composite error variance.
For this RE estimation, the individual-specific error explains about
81% of the entire composite error variance.
This ratio may be interpreted as a goodness-of-fit measure of the RE model.
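The arithmetic behind the variance components and rho can be checked with a short Python sketch. It uses only the rounded sigma_u and sigma_e values quoted above, so the last digit may differ slightly from what Stata reports from unrounded values; the variable names are illustrative.

```python
sigma_u = 0.1249  # sigma_u as quoted in the text (rounded)
sigma_e = 0.0601  # sigma_e as quoted in the text (rounded)

var_u = sigma_u ** 2            # variance component for groups
var_e = sigma_e ** 2            # variance component for errors
rho = var_u / (var_u + var_e)   # share of composite variance due to u_i

print(round(var_u, 4), round(var_e, 4), round(rho, 3))  # 0.0156 0.0036 0.812
```

The result, roughly 0.81, agrees with the reported rho and with the 81% interpretation given above.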