
Tuesday, 24 January 2017

Random Effects (RE) Model with Stata (Panel)



If there is no individual effect (cross-sectional or time-specific), that is \(\left( {{u}_{i}}=0 \right)\), OLS produces efficient and consistent parameter estimates. Consider the model

\({{y}_{it}}={{\beta }_{0}}+{{\beta }_{1}}{{x}_{it}}+{{u}_{i}}+{{v}_{it}}\)  (1)

in which we assume that \(\left( {{u}_{i}}=0 \right)\), so that pooled OLS applies.

OLS rests on five core assumptions (Greene, 2008; Kennedy, 2008):
o   Linearity – the dependent variable is a linear function of the regressors and the disturbance.
o   Exogeneity – the expected value of the disturbance is zero, or the disturbances are not correlated with any regressor.
o   Homoscedasticity and no autocorrelation – the disturbances have the same variance and are not correlated with one another.
o   Non-stochastic regressors – the independent variables are fixed in repeated samples.
o   Full rank – there is no exact linear relationship among the independent variables.

There are several strategies for estimating a fixed effect model: the least squares dummy variable (LSDV) model, within estimation and between estimation.

Random Effects (RE) Model

In the FE model discussed previously, the goal of estimation is to eliminate \({{u}_{i}}\) because it is thought to be correlated with one or more of the \({{x}_{it}}\).

But suppose we assume \({{u}_{i}}\) is uncorrelated with each explanatory variable in all time periods. Then using a transformation to eliminate \({{u}_{i}}\) results in inefficient estimators.

Eq (1) becomes a RE model when we assume that the unobserved effect \({{u}_{i}}\) is uncorrelated with each explanatory variable in every time period:

\(Cov\left( {{x}_{it}},{{u}_{i}} \right)=0,\quad t=1,\ldots ,T\)        (2)

The ideal RE assumptions include all the FE assumptions plus the additional requirement that \({{u}_{i}}\) is independent of all explanatory variables in all time periods.

If we believe the unobserved effect \({{u}_{i}}\) is correlated with any explanatory variable, we should use first differencing or FE instead.

To estimate RE, we define the composite error term as \({{w}_{it}}={{u}_{i}}+{{v}_{it}}\), so that Eq (1) can be written as:

\({{y}_{it}}={{\beta }_{0}}+{{\beta }_{1}}{{x}_{it}}+{{w}_{it}}\)         (3)


Because \({{u}_{i}}\) is in the composite error in each time period, the \({{w}_{it}}\) are serially correlated across time.
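
To see where this serial correlation comes from, here is a short derivation. It assumes, as is standard for the RE model, that \(u_i\) and \(v_{it}\) have mean zero and that \(v_{it}\) is uncorrelated over time and uncorrelated with \(u_i\). It writes \(\sigma_u^2 = Var(u_i)\) and \(\sigma_v^2 = Var(v_{it})\). For \(t \ne s\):

\(Cov\left( w_{it}, w_{is} \right) = E\left[ \left( u_i + v_{it} \right)\left( u_i + v_{is} \right) \right] = E\left( u_i^2 \right) + E\left( u_i v_{is} \right) + E\left( v_{it} u_i \right) + E\left( v_{it} v_{is} \right) = \sigma_u^2\)

\(Var\left( w_{it} \right) = Var\left( u_i \right) + Var\left( v_{it} \right) = \sigma_u^2 + \sigma_v^2\)

The three cross terms vanish under the stated assumptions, so the covariance between any two periods is just the variance of the individual effect. Dividing it by the variance of the composite error gives the correlation between any two periods.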

Under the RE assumptions;

\(Corr\left( {{w}_{it}},{{w}_{is}} \right)=\sigma _{u}^{2}/\left( \sigma _{u}^{2}+\sigma _{v}^{2} \right),t\ne s\)        (4)

where \(\sigma _{u}^{2}=Var\left( {{u}_{i}} \right)\)  and \(\sigma _{v}^{2}=Var\left( {{v}_{it}} \right)\)  
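
Eq (4) can also be checked numerically. The following is a minimal Python sketch, not part of the Stata workflow of this post: it simulates a composite error with made-up values \(\sigma_u = 1.0\) and \(\sigma_v = 0.5\) (the sample size, seed and function name are likewise illustrative assumptions) and compares the sample correlation between two periods with the ratio in Eq (4).

```python
import numpy as np

def simulated_vs_theoretical_corr(sigma_u=1.0, sigma_v=0.5, n=100_000, seed=0):
    """Compare the sample Corr(w_i1, w_i2) with sigma_u^2 / (sigma_u^2 + sigma_v^2)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma_u, size=n)        # individual effect, constant over time
    v = rng.normal(0.0, sigma_v, size=(n, 2))   # idiosyncratic errors for two periods
    w = u[:, None] + v                          # composite error w_it = u_i + v_it
    empirical = np.corrcoef(w[:, 0], w[:, 1])[0, 1]
    theoretical = sigma_u**2 / (sigma_u**2 + sigma_v**2)
    return empirical, theoretical

empirical, theoretical = simulated_vs_theoretical_corr()
print(f"sample correlation: {empirical:.3f}, Eq (4): {theoretical:.3f}")
```

With a large number of simulated individuals the two numbers should agree closely. The larger \(\sigma_u\) is relative to \(\sigma_v\), the closer the correlation is to one.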

The RE model is estimated by GLS when the covariance structure is known, and by feasible GLS (FGLS), also called estimated GLS (EGLS), when the covariance structure of the composite error is unknown.

Compared to the FE model, a RE model is relatively difficult to estimate. In FGLS, we first have to estimate \(\theta \) using \(\hat{\sigma }_{u}^{2}\) and \(\hat{\sigma }_{v}^{2}\).

The \(\hat{\sigma }_{u}^{2}\) comes from the between effect estimation (group mean regression), and \(\hat{\sigma }_{v}^{2}\) is derived from the RSS of the within effect estimation, that is, from the deviations of the residuals from their group means:


\(\hat{\theta }=1-\sqrt{\frac{\hat{\sigma }_{v}^{2}}{T\hat{\sigma }_{u}^{2}+\hat{\sigma }_{v}^{2}}}=1-\sqrt{\frac{\hat{\sigma }_{v}^{2}}{T\hat{\sigma }_{between}^{2}}}\)           (5)


where \(\hat{\sigma }_{u}^{2}=\hat{\sigma }_{between}^{2}-\frac{\hat{\sigma }_{v}^{2}}{T}\), \(\hat{\sigma }_{between}^{2}=\frac{RS{{S}_{between}}}{n-k-1}\), and
\(\hat{\sigma }_{v}^{2}=\frac{RS{{S}_{within}}}{nT-n-k}=\frac{e'{{e}_{within}}}{nT-n-k}=\frac{\sum\nolimits_{i=1}^{n}{\sum\nolimits_{t=1}^{T}{{{\left( {{v}_{it}}-{{{\bar{v}}}_{i}} \right)}^{2}}}}}{nT-n-k}\),
where \({{v}_{it}}\) are the residuals of the LSDV, \(n\) is the number of groups, \(T\) is the number of time periods and \(k\) is the number of regressors excluding the intercept.
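
As a rough numerical illustration of Eq (5), the Python sketch below computes \(\hat{\theta }\) from the two standard deviations. The inputs 0.1249 and 0.0601 are the sigma_u and sigma_e values quoted later in this post. T = 15 periods per airline is an assumption, since the post does not state the panel length, and the function name is hypothetical.

```python
import math

def re_theta(sigma_u, sigma_v, T):
    """Quasi-demeaning weight from Eq (5) for a balanced panel with T periods."""
    return 1.0 - math.sqrt(sigma_v**2 / (T * sigma_u**2 + sigma_v**2))

# sigma_u and sigma_e as quoted from the Stata output discussed in this post;
# T = 15 is an assumed panel length, not a figure given in the post.
theta_hat = re_theta(0.1249, 0.0601, T=15)
print(round(theta_hat, 4))

# Limiting cases: with no individual effect theta is 0 (pooled OLS on the raw data);
# a dominant individual effect pushes theta toward 1 (the within/FE transformation).
print(re_theta(0.0, 0.0601, T=15))
print(round(re_theta(10.0, 0.0601, T=15), 4))
```

The limiting cases show why RE sits between the two other estimators: when \(\hat{\theta }\) is close to 0 the RE estimates approach pooled OLS, and when it is close to 1 they approach the FE (within) estimates.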

Then, the dependent variable, independent variables, and the intercept term need to be transformed as follows;

\(y_{it}^{*}={{y}_{it}}-\hat{\theta }{{\bar{y}}_{i}}\)        (6)
\(x_{it}^{*}={{x}_{it}}-\hat{\theta }{{\bar{x}}_{i}}\)        (7)
\(\beta _{0}^{*}=1-\hat{\theta }\)          (8)

Finally, run OLS on the transformed variables from Eq (6), (7) and (8), with the usual intercept suppressed:

\(y_{it}^{*}=\beta _{0}^{*}+{{\beta }_{1}}x_{it}^{*}+\varepsilon _{it}^{*}\)       (9)
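
The whole FGLS procedure from Eq (5) through Eq (9) can be sketched in a few lines of Python. This is only an illustration on simulated data for a balanced panel with one regressor, not the routine Stata uses internally. The function name, the true parameter values (intercept 1, slope 2, \(\sigma_u = 1\), \(\sigma_v = 0.5\)), the panel size and the seed are all assumptions made up for the example.

```python
import numpy as np

def random_effects_fgls(y, x):
    """FGLS random effects for a balanced panel; y and x have shape (n, T)."""
    n, T = y.shape
    k = 1                                            # one regressor besides the intercept
    y_bar = y.mean(axis=1, keepdims=True)            # group means
    x_bar = x.mean(axis=1, keepdims=True)

    # Within estimation: OLS on deviations from group means gives sigma_v^2.
    yw, xw = (y - y_bar).ravel(), (x - x_bar).ravel()
    b_within = (xw @ yw) / (xw @ xw)
    rss_within = np.sum((yw - b_within * xw) ** 2)
    sigma_v2 = rss_within / (n * T - n - k)

    # Between estimation: OLS on group means gives sigma_between^2.
    Xb = np.column_stack([np.ones(n), x_bar.ravel()])
    coef_b, *_ = np.linalg.lstsq(Xb, y_bar.ravel(), rcond=None)
    rss_between = np.sum((y_bar.ravel() - Xb @ coef_b) ** 2)
    sigma_between2 = rss_between / (n - k - 1)

    # Variance of the individual effect (truncated at zero) and theta, Eq (5).
    sigma_u2 = max(sigma_between2 - sigma_v2 / T, 0.0)
    theta = 1.0 - np.sqrt(sigma_v2 / (T * sigma_u2 + sigma_v2))

    # Quasi-demeaned data, Eq (6)-(8), then OLS without a separate constant, Eq (9).
    y_star = (y - theta * y_bar).ravel()
    x_star = (x - theta * x_bar).ravel()
    const_star = np.full(n * T, 1.0 - theta)         # transformed intercept column
    X_star = np.column_stack([const_star, x_star])
    beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta, theta

# Simulated balanced panel with made-up parameters (not the airline data).
rng = np.random.default_rng(0)
n, T = 200, 5
x = rng.normal(size=(n, T))
u = rng.normal(0.0, 1.0, size=(n, 1))                # individual effect, uncorrelated with x
v = rng.normal(0.0, 0.5, size=(n, T))
y = 1.0 + 2.0 * x + u + v
beta, theta = random_effects_fgls(y, x)
print("beta0, beta1:", beta, "theta:", theta)
```

In this sketch the column \(1-\hat{\theta }\) plays the role of the transformed intercept in Eq (8), so its coefficient is the estimate of \({{\beta }_{0}}\), and the coefficient on the transformed regressor is the estimate of \({{\beta }_{1}}\).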


Estimation using Stata

For our discussion of RE using Stata, let us use the airline.dta data again, as in the earlier discussion of the FE model. We want to estimate the effects of output, fuel price and load factor on the cost of airline companies:

\(cos{{t}_{it}}={{\beta }_{0}}+{{\beta }_{1}}outpu{{t}_{it}}+{{\beta }_{2}}fue{{l}_{it}}+{{\beta }_{3}}loa{{d}_{it}}+{{v}_{it}}\)      (10)                


where;
\(cos{{t}_{it}}\)               = cost of airline companies
\(outpu{{t}_{it}}\)           = revenue passenger mile (output index)
\(fue{{l}_{it}}\)               = fuel prices
\(loa{{d}_{it}}\)              = loading factor (average capacity utilization of the fleet)


Now, let us estimate Eq (10) by pooled OLS:

reg cost output fuel load

 

Now, let us estimate the RE model. Doing this by hand requires estimating Eq (3) first and then computing the value of \(\theta \) manually as in Eq (5). After that, we need to transform the data manually, based on the value of \(\hat{\theta }\), as in Eq (6), Eq (7) and Eq (8), and then estimate the RE model by OLS as in Eq (9).

In Stata, we can skip the manual calculation and estimation in Eq (5) through Eq (9): the command xtreg with the re option estimates Eq (9) automatically and reports the RE estimates.

Before we run the xtreg command, we first need to specify the cross-sectional and time-series variables:

xtset airline year
 

To estimate the RE model as in Eq (9), with the theta option so that Stata also reports \(\hat{\theta }\):

xtreg cost output fuel load, re theta

 


The sigma_u and sigma_e are the square roots of the variance components for groups and errors, respectively \(\left( 0.0156={{0.1249}^{2}},0.0036={{0.0601}^{2}} \right)\).

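Note that the value displayed under sigma_e is not the RSS itself: it is the square root of \(\hat{\sigma }_{v}^{2}=RS{{S}_{within}}/\left( nT-n-k \right)\), that is, the root mean squared error of the within estimation.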

The rho represents the ratio of the individual-specific error variance to the composite (entire) error variance, \(0.8119={{0.1249}^{2}}/\left( {{0.1249}^{2}}+{{0.0601}^{2}} \right)\).

A large ratio means that the individual-specific error accounts for a large proportion of the composite error variance.
For this RE estimation, the individual-specific error explains 81% of the entire composite error variance.

This ratio may be interpreted as a goodness-of-fit measure of the RE model.
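
The figures read off the Stata output can be cross-checked with a few lines of arithmetic. The Python snippet below is only a convenience sketch. It uses the rounded sigma_u and sigma_e values quoted in this post, so the results can differ from Stata's own figures in the last digit.

```python
sigma_u = 0.1249   # standard deviation of the individual effect, as quoted in this post
sigma_e = 0.0601   # standard deviation of the idiosyncratic error, as quoted in this post

var_u = sigma_u**2                 # variance component for groups
var_e = sigma_e**2                 # variance component for errors
rho = var_u / (var_u + var_e)      # share of composite error variance due to u_i

print(f"var_u = {var_u:.4f}, var_e = {var_e:.4f}, rho = {rho:.4f}")
```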






