Dependent variables and regressors can potentially vary over both
time and individual.
Within variation – variation over time for a given individual
(time-variant).
Between variation – variation across individuals (time-invariant).
Overall variation – variation over both time and individuals.
Individual mean:
\({{\bar{x}}_{i}}=\frac{1}{T}\sum\nolimits_{t}{{{x}_{it}}}\)
Overall mean:
\(\bar{x}=\frac{1}{NT}\sum\nolimits_{i}{\sum\nolimits_{t}{{{x}_{it}}}}\)
Overall variance:
\(s_{o}^{2}=\frac{1}{NT-1}\sum\nolimits_{i}{\sum\nolimits_{t}{{{\left( {{x}_{it}}-\bar{x} \right)}^{2}}}}\)
Between variance:
\(s_{B}^{2}=\frac{1}{N-1}\sum\nolimits_{i}{{{\left( {{\bar{x}}_{i}}-\bar{x} \right)}^{2}}}\)
Within variance:
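\(s_{W}^{2}=\frac{1}{NT-1}\sum\nolimits_{i}{\sum\nolimits_{t}{{{\left( {{x}_{it}}-{{\bar{x}}_{i}} \right)}^{2}}}}=\frac{1}{NT-1}\sum\nolimits_{i}{\sum\nolimits_{t}{{{\left( {{x}_{it}}-{{\bar{x}}_{i}}+\bar{x}-\bar{x} \right)}^{2}}}}\)
The second form shows that \(s_{W}^{2}\) is the variance of \({{x}_{it}}-{{\bar{x}}_{i}}+\bar{x}\), the within deviation with the overall mean added back, which is the quantity that xtsum summarizes.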
The overall variation can be decomposed into between variation and
within variation:
\(s_{o}^{2}\approx s_{B}^{2}+s_{W}^{2}\)
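The approximation can be made precise under one added assumption: a balanced panel, in which every individual is observed for the same \(T\) periods. Take \(s_{B}^{2}\) to be the variance of the \(N\) individual means, with divisor \(N-1\). Because the within deviations \({{x}_{it}}-{{\bar{x}}_{i}}\) sum to zero over \(t\) for each individual, the cross term vanishes and the sums of squares decompose exactly. A short derivation sketch:

```latex
\begin{aligned}
\sum_{i}\sum_{t}\left(x_{it}-\bar{x}\right)^{2}
  &= \sum_{i}\sum_{t}\left(x_{it}-\bar{x}_{i}\right)^{2}
   + T\sum_{i}\left(\bar{x}_{i}-\bar{x}\right)^{2} \\
(NT-1)\,s_{o}^{2}
  &= (NT-1)\,s_{W}^{2} + T(N-1)\,s_{B}^{2} \\
s_{o}^{2}
  &= s_{W}^{2} + \frac{T(N-1)}{NT-1}\,s_{B}^{2}
\end{aligned}
```

The factor \(T(N-1)/(NT-1)\) approaches 1 as \(N\) grows, which is why the decomposition holds only approximately. With \(N=595\) and \(T=7\) the factor is \(4158/4164\approx 0.9986\).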
We use the data Paneldata01.
To generate this variance decomposition:
xtsum
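In practice xtsum needs the panel structure to be declared first, and it is usually run on a list of variables. A minimal sketch, assuming the file is saved as Paneldata01.dta in the working directory and that the time variable is called t (the name t is an assumption; id is the identifier used later in the clustering option):

```stata
* Assumption: Paneldata01.dta is in the current working directory
use Paneldata01, clear

* Declare the panel structure; the time variable name t is assumed
xtset id t

* Overall, between, and within summary statistics for selected variables
xtsum lwage exp wks south
```

For a time-invariant variable the within standard deviation reported by xtsum is zero, which gives a quick check of which regressors vary only across individuals.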
In xtsum output, Stata uses lowercase \(n\) to denote the number
of individuals and uppercase \(N\) to denote the total number of
individual-time observations.
The xttab command tabulates a variable and provides additional detail on
its within and between variation. Its output for south reads as follows.
The overall summary shows that 71% of the 4165 individual-year observations
had south = 0 and 29% had south = 1.
The between summary indicates that, of the 595 people, 72% had south = 0 at
least once and 31% had south = 1 at least once. These percentages sum to more
than 100% because people who moved are counted in both groups.
The within summary indicates that 95% of people who ever lived in the south
always lived in the south during the period covered by the panel, and 98% of
those who ever lived outside the south always lived outside the south.
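The summaries above correspond to a one-line call. A minimal sketch, assuming the data have already been declared as a panel (for example with xtset id t, where the time variable name t is an assumption):

```stata
* Overall, between, and within tabulation of the binary variable south
xttab south
```

xttab is most informative for variables that take only a few distinct values, such as the 0/1 indicator south.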
The xttrans command tabulates transition probabilities from one
period to the next.
One period per individual is lost in calculating transitions, so 3570
observations (4165 - 595) are used.
For a time-invariant variable, the diagonal entries are 100% and the
off-diagonal entries are 0%.
For south, 99.2% of the observations in the south in one period remain in
the south in the next period. Of those outside the south in one period,
99.7% remain outside the south in the next period.
The south variable is therefore close to time-invariant.
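The transition table discussed above could be produced along these lines. This is a sketch that assumes the data are already xtset with both a panel and a time variable, since transitions are defined between consecutive periods:

```stata
* Transition probabilities for south from one period to the next;
* the freq option adds the underlying counts to the row percentages
xttrans south, freq
```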
TIME-SERIES PLOTS FOR EACH INDIVIDUAL
Line graphs can be used to plot a variable against time for each individual,
either in a separate panel per individual or overlaid in a single graph.
Here we do both for lwage, using the first 20 individuals in the sample.
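One way to draw these two graphs is the xtline command. The sketch below is not necessarily the command originally used. It assumes the data are xtset with a time variable and that id is numbered consecutively from 1, so that id <= 20 selects the first 20 individuals:

```stata
* Separate time-series panel of lwage for each of the first 20 individuals
xtline lwage if id <= 20

* The same 20 individuals overlaid in a single graph
xtline lwage if id <= 20, overlay
```

The overlaid version makes differences in levels across individuals easy to see, while the separate panels show each individual's time path more clearly.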
OVERALL SCATTERPLOT
When interest centers on the relation between two variables, or when there
is one key regressor, a useful starting point is a scatterplot of the
dependent variable against the key regressor using data from all panel
observations.
To produce a scatterplot of lwage against exp with fitted linear and
quadratic regression lines added:
graph twoway (scatter lwage exp) (lfit lwage exp) (qfit lwage exp)
WITHIN AND BETWEEN SCATTERPLOT
The xtdata command transforms the data in memory before plotting. Its option
fe gives the within variation, be gives the between variation, and
re gives the random-effects variation.
To produce a scatterplot of the within variation in lwage and exp:
xtdata, fe
graph twoway (scatter lwage exp) (lfit lwage exp) (qfit lwage exp)
To produce a scatterplot of the between variation in lwage and exp:
xtdata, be
graph twoway (scatter lwage exp) (lfit lwage exp) (qfit lwage exp)
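Because xtdata replaces the variables in memory with their transformed values, running the within and between plots one after the other in the same session would apply the second transformation to already transformed data. A hedged sketch of one way around this is to wrap each plot in preserve and restore. It assumes the data are already xtset, and it uses the clear option so that xtdata proceeds even if the dataset has changed since it was last saved:

```stata
* Within scatterplot: transform, plot, then bring back the original data
preserve
xtdata, fe clear
graph twoway (scatter lwage exp) (lfit lwage exp) (qfit lwage exp)
restore

* Between scatterplot on the restored original data
preserve
xtdata, be clear
graph twoway (scatter lwage exp) (lfit lwage exp) (qfit lwage exp)
restore
```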
POOLED OLS REGRESSION WITH CLUSTER-ROBUST STANDARD ERRORS
The individual-specific-effects
model for the scalar dependent variable \({{y}_{it}}\) specifies that
\({{y}_{it}}={{\alpha }_{i}}+\mathbf{x}'_{it}\beta +{{\varepsilon }_{it}}\) (1)
where \(\mathbf{x}_{it}\) are regressors, \({{\alpha }_{i}}\) are random
individual-specific effects, and \({{\varepsilon }_{it}}\) is an
idiosyncratic error.
From our panel data, the econometric model that we want to estimate
is
\(\text{lwage}_{it}=\alpha +{{\beta }_{1}}\text{exp}_{it}+{{\beta }_{2}}\text{exp2}_{it}+{{\beta }_{3}}\text{wks}_{it}+{{\beta }_{4}}\text{ed}_{it}+{{u}_{it}}\) (2)
The variable ed is time-invariant, while the variables exp and wks are
time-varying.
Pooled OLS estimation of Eq. (2) yields consistent estimates of the
\(\beta \)'s if the composite error
\({{u}_{it}}={{\alpha }_{i}}-\alpha +{{\varepsilon }_{it}}\) is uncorrelated
with the regressors.
However, \({{u}_{it}}\) is likely to be correlated over time for a given
individual, so we use cluster-robust standard errors that cluster on the
individual.
The option vce(cluster clustervar) affects the standard errors and the
variance-covariance matrix of the estimators but not the estimated
coefficients.
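One way to see that the option changes only the standard errors is to fit the model with and without it and put the results side by side. A minimal sketch, where the stored-result names ols_default and ols_cluster are arbitrary labels introduced here for illustration:

```stata
* Pooled OLS with default (i.i.d.) standard errors
regress lwage exp exp2 wks ed
estimates store ols_default

* Same model with standard errors clustered on the individual
regress lwage exp exp2 wks ed, vce(cluster id)
estimates store ols_cluster

* Coefficients are identical across columns; only the standard errors differ
estimates table ols_default ols_cluster, b(%9.4f) se(%9.4f)
```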
regress lwage exp exp2 wks ed, vce(cluster id)
The output of this regression shows \({{R}^{2}}=0.28\), and the estimates
imply that wages increase with experience until a peak at 31 years
[= 0.0447/(2 x 0.00072)] and then decline. Wages increase by 0.6% with each
additional week worked and by 7.6% with each additional year of education.
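The peak follows from setting the derivative of the quadratic in experience to zero, which gives exp* = -b_exp/(2 x b_exp2). Rather than computing it by hand, it could be obtained after estimation with nlcom, which also reports a delta-method standard error. This sketch assumes that exp2 is the square of exp; the label peak is just a name chosen here:

```stata
* Re-fit the pooled OLS model with cluster-robust standard errors
regress lwage exp exp2 wks ed, vce(cluster id)

* Experience level at which predicted log wage peaks: -b_exp / (2*b_exp2)
nlcom (peak: -_b[exp] / (2*_b[exp2]))
```

Because nlcom uses the estimated variance-covariance matrix, the confidence interval for the peak reflects the clustering on id.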