Download the notebook here
!
Interactive online version:
Replication of Angrist (1990): Lifetime earnings and the Vietnam era draft lottery: Evidence from social security administrative records
Project by Pascal Heid, Summer 2020.
[1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from auxiliary.auxiliary_figures import get_figure1, get_figure2, get_figure3
from auxiliary.auxiliary_tables import (
get_table1,
get_table2,
get_table3,
get_table4,
)
from auxiliary.auxiliary_data import process_data
from auxiliary.auxiliary_visuals import background_negative_green, p_value_star
from auxiliary.auxiliary_extensions import (
get_flexible_table4,
get_figure1_extension1,
get_figure2_extension1,
get_bias,
get_figure1_extension2,
get_figure2_extension2,
)
import warnings
warnings.filterwarnings("ignore")
plt.rcParams["figure.figsize"] = [12, 6]
This notebook replicates the core results of the following paper:
Angrist, Joshua. (1990). Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records. American Economic Review. 80. 313-36.
In the following just a few notes on how to read the remainder:
In this excerpt I replicate the Figures 1 to 3 and the Tables 1 to 4 (in some extended form) while I do not consider Table 5 to be a core result of the paper which is why it cannot be found in this notebook.
I follow the example of Angrist keeping his structure throughout the replication part of this notebook.
The naming and order of appearance of the figures does not follow the original paper but the published correction.
The replication material including the partially processed data as well as some replication do-files can be found here.
1. Introduction
For a soft introduction to the topic, let us have a look at the goal of Angrist’s article. Already in the first few lines Angrist states a clear-cut aim for his paper by making the remark that “yet, academic research has not shown conclusively that Vietnam (or other) veterans are worse off economically than nonveterans”. He further elaborates on why research had yet been so inconclusive. He traces it back to the flaw that previous research had solely tried to estimate the effect of veteran status on subsequent earnings by comparing the latter across individuals differing in veteran status. He argues that this naive estimate might likely be biased as it is easily imaginable that specific types of men choose to enlist in the army whose unobserved characteristics imply low civilian earnings (self-selcetion on unobservables).
Angrist avoids this pitfall by employing an instrumental variable strategy to obtain unbiased estimates of the effect of veteran status on earnings. For that he exploits the random nature of the Vietnam draft lottery. This lottery randomly groups people into those that are eligible to be forced to join the army and those that are not. The idea is that this randomly affects the veteran status without being linked to any unobserved characteristics that cause earnings. This allows Angrist to obtain an estimate of the treatment effect that does not suffer from the same shortcomings as the ones of previous studies.
He finds that Vietnam era veterans are worse off when it comes to long term annual real earnings as opposed to those that have not served in the army. In a secondary point he traces this back to the loss of working experience for veterans due to their service by estimating a simple structural model.
In the following sections I first walk you through the identification idea and empirical strategy. Secondly, I replicate and explain the core findings of the paper with a rather extensive elaboration on the different data sources used and some additional visualizations. Thirdly, I critically assess the paper followed by my own two extensions concluding with some overall remarks right after.
2. Identification and Empirical Approach
As already mentioned above the main goal of Angrist’s paper is to determine the causal effect of veteran status on subsequent earnings. He believes for several reasons that conventional estimates that only compare earnings by veteran status are biased due to unobservables that affect both the probability of serving in the military as well as earnings over lifetime. This is conveniently shown in the causal graph below. Angrist names two potential reasons why this might be likely. First of all, he makes the point that probably people with few civilian opportunities (lower expected earnings) are more likely to register for the army. Without a measure for civilian opportunities at hand a naive estimate of the effect of military service on earnings would not be capable of capturing the causal effect. Hence, he believes that there is probably some self-selection into treatment on unobservables by individuals. In a second point, Angrist states that the selection criteria of the army might be correlated with unobserved characteristics of individuals that makes them more prone to receiving future earnings pointing into a certain direction.
Econometrically spoken, Angrist argues with the following linear regression equation representing a version of the right triangle in the causal graph:
He argues that estimating the above model with the real earnings \(y_{cti}\) for an individual \(i\) in cohort \(c\) at time \(t\) being determined by cohort and time fixed effects (\(\beta_c\) and \(\delta_t\)) as well an individual effect for veteran status is biased. This is for the above given reasons that the indicator for veteran status \(s_i\) is likely to be correlated with the error term \(u_{it}\).
Angrist’s approach to avoid bias is now to employ an instrumental variable approach which is based on the accuracy of the causal graph below.
The validity of this causal graph rests on the crucial reasoning that there is no common cause of the instrument (Draft Lottery) and the unobserved variables (U). Angrist provides the main argument that the draft lottery was essentially random in nature and hence is not correlated with any personal characteristics and therefore not linked to any unobservables that might determine military service and earnings. As will be later explained in more detail, the Vietnam draft lottery determined randomly on the basis of the birth dates whether a person is eligible to be drafted by the army in the year following the lottery. The directed graph from Draft Lottery to Military Service is therefore warranted as the fact of having a lottery number rendering a person draft-eligible increases the probability of joining the military as opposed to a person that has an excluded lottery number.
This argumentation leads Angrist to use the probability of being a veteran conditional on being draft-eligible in the lottery as an instrument for the effect of veteran status on earnings. In essence this is the Wald estimate which is equal to the following formula:
The nominator equals to the estimated \(\alpha\) from equation (1) while the denominator can be obtained by a first stage regression which regresses veteran status on draft-eligibility. It reduces to estimating the difference in conditional probabilities of being a veteran \(prob(veteran \mid eligible = 1) - prob(veteran \mid eligible = 0)\). Estimates for this are obtained by Angrist through weighted least squares (WLS). This is done as Angrist does not have micro data but just grouped data (for more details see the data section in the replication). In order to obtain the estimates of the underlying micro level data it is necessary to adjust OLS by the size of the respective groups as weights. The above formula is also equivalent to a Two Stage Least Squares (2SLS) procedure in which earnings are regressed on the fitted values from a first stage regression of veteran status on eligibility.
In a last step, Angrist generalizes the Wald grouping method to more than just one group as instrument. There are 365 lottery numbers that were split up into two groups (eligible and non-eligible) for the previous Wald estimate. Those lottery numbers can also be split up even further into many more subgroups than just two, resulting in many more dummy variables as instruments. Angrist splits the lottery numbers into intervals of five which determine a group \(j\). By cohort \(c\) he estimates for each group \(j\) the conditional probability of being a veteran \(p_{cj}\). This first stage is again run by WLS. The resulting estimate \(\hat p_{cj}\) is then used to conduct the second stage regression below.
The details and estimation technique will be further explained when presenting the results in the replication section below.
3. Replication
3.1 Background and Data
The Vietnam Era Draft Lottery
Before discussing how the data looks like it is worthwhile to understand how the Vietnam era draft lottery was working in order to determine to which extent it might actually serve as a valid instrument. During the Vietnam war there were several draft lotteries. They were held in the years from 1970 to 1975. The first one took place at the end of 1969 determining which men might be drafted in the following year. This procedure of determining the lottery numbers for the following year continued until 1975. The table below shows for which years there were lotteries drawn and which birth years were affected by them in the respective year. For more details have a look here.
Year |
Cohorts |
Draft-Eligibility Ceiling |
---|---|---|
1970 |
1944-50 |
195 |
1971 |
1951 |
125 |
1972 |
1952 |
95 |
1973 |
1953 |
95 |
1974 |
1954 |
95 |
1975 |
1955 |
95 |
1976 |
1956 |
95 |
The authority of drafting men for the army through the lottery expired on June 30, 1973 and already before no one was drafted anymore. The last draft call took place on December 7, 1972.
The general functioning of those seven lotteries was that every possible birthday (365 days) was randomly assigned a number between 1 and 365 without replacement. Taking the 1969 lottery this meant that the birthdate that had the number 1 assigned to, it caused every man born on that day in the years 1944 to 1950 to be drafted first if it came to a draft call in the year 1970. In practice, later in the same year of the draft lottery, the army announced draft-eligibility ceilings determining up to which draft lottery number was called in the following year. In 1970, this means that every man having a lottery number of below 195 was called to join the army. As from 1973 on nobody was called anymore, the numbers for the ceiling are imputed from the last observed one which was 95 in the year 1972. Men with lottery numbers below the ceiling for their respective year are from here on called “draft-eligible”.
Being drafted did not mean that one actually had to serve in the army, though. Those drafted had to pass mental and physical tests which in the end decided who had to join. Further it should be mentioned that Angrist decides to only use data on those that turned 19 when being at risk of induction which includes men born between 1950 and 1953.
The Data
Continuous Work History Sample (CWHS)
This administrative data set constitutes a random one percent sample draw of all possible social security numbers in the US. For the years from 1964 until 1984 it includes the FICA (social security) earnings history censored to the Social Security maximum taxable amount. It further includes FICA taxable earnings from self-employment. For the years from 1978 on it also has a series on total earnings (Total W-2) including for instance cash payments but excluding earnings from self-employment. This data set has some confidentiality restrictions which means that only group averages and variances were available. This means that Angrist cannot rely on micro data but has to work with sample moment which is a crucial factor for the exact implementation of the IV method. A group is made of by year of earnings, year of birth, ethnicity and five consecutive lottery numbers. The statistics collected for those also include the number of people in the group, the fraction of them having taxable earnings equal and above the taxable maximum and the fraction having zero earnings.
Regarding the actual data sets available for replication we have the data set cwhsa
which consists of the above data for the years from 1964 to 1977 and then cwhsb
which consists of the CWHS for the years after.
Above that Angrist provides the data set cwhsc_new
which includes the adjusted FICA earnings. For those Angrist employed a strategy to approximate the underlying uncensored FICA earnings from the reported censored ones. All of those three different earnings variables are used repeatedly throughout the replication.
[3]:
process_data("cwhsa")
[3]:
earnings | earnings variance | sample size | fraction zero earnings | ||||
---|---|---|---|---|---|---|---|
ethnicity | birth year | year | lottery interval | ||||
1 | 44 | 64 | 1 | 1691.030029 | 1480.599976 | 182.0 | 0.170 |
2 | 1535.430054 | 1359.020020 | 187.0 | 0.187 | |||
3 | 1818.010010 | 1604.420044 | 210.0 | 0.171 | |||
4 | 1636.380005 | 1626.270020 | 208.0 | 0.231 | |||
5 | 1889.800049 | 1639.609985 | 207.0 | 0.184 | |||
... | ... | ... | ... | ... | ... | ... | ... |
2 | 53 | 77 | 69 | 3643.739990 | 4273.600098 | 53.0 | 0.415 |
70 | 4127.490234 | 5623.089844 | 55.0 | 0.473 | |||
71 | 4712.459961 | 4588.279785 | 76.0 | 0.316 | |||
72 | 4676.939941 | 5321.140137 | 85.0 | 0.353 | |||
73 | 4651.870117 | 4989.020020 | 83.0 | 0.241 |
20440 rows × 4 columns
The above earnings data only consists of FICA earnings. The lottery intervals from 1 to 73 are equivalent to intervals of five consecutive lottery numbers. Consequently, the variable lottery interval equals to one for the lottery numbers 1 to 5 and so on. The ethnicity variable is encoded as 1 for a white person and 2 for a nonwhite person.
[4]:
process_data("cwhsb")
[4]:
earnings | earnings variance | sample size | fraction zero earnings | |||||
---|---|---|---|---|---|---|---|---|
data source | ethnicity | birth year | year | lottery interval | ||||
TAXAB | 1 | 44 | 78 | 1 | 10625.58 | 7052.47 | 179 | 0.179 |
2 | 11546.46 | 8032.55 | 182 | 0.198 | ||||
3 | 11401.16 | 7508.27 | 209 | 0.196 | ||||
4 | 10899.99 | 7342.60 | 206 | 0.189 | ||||
5 | 11667.14 | 7507.56 | 207 | 0.159 | ||||
... | ... | ... | ... | ... | ... | ... | ... | ... |
TOTAL | 2 | 53 | 84 | 69 | 6846.43 | 9117.49 | 53 | 0.396 |
70 | 11357.89 | 14734.47 | 55 | 0.455 | ||||
71 | 8695.86 | 9613.24 | 76 | 0.368 | ||||
72 | 14013.24 | 14182.30 | 84 | 0.274 | ||||
73 | 10742.71 | 18095.78 | 83 | 0.506 |
20440 rows × 4 columns
As stated above this data now consists of earnings from 1978 to 1984 for FICA (here encoded as “TAXAB”) and Total W-2 (encoded as “TOTAL”).
Survey of Income and Program Participation (SIPP) and the Defense Manpower Data Center (DMDC)
Throughout the paper it is necessary to have a measure of the fraction of people serving in the military. For this purpose the above two data sources are used.
The SIPP is a longitudinal survey of around 20,000 households in the year 1984 for which is determined whether the persons in the household are Vietnam war veterans. The survey also collected data on ethnicity and birth data which made it possible to match the data to lottery numbers. The DMDC on the other hand is an administrative record which shows the total number of new entries into the army by ethnicity, cohort and lottery number per year from mid 1970 until the end of 1973. Those
sources are needed for the results in Table 3 and 4. A combination of those two are matched to the earnings data of the CWHS which constitutes the data set chwsc_new
below.
[5]:
data_cwhsc_new = process_data("cwhsc_new")
data_cwhsc_new
[5]:
earnings | probability of serving | |||||
---|---|---|---|---|---|---|
data source | ethnicity | birth year | year | lottery interval | ||
ADJ | 1 | 50 | 74 | 1 | 8853.940430 | 0.352700 |
75 | 1 | 9062.639648 | 0.352700 | |||
76 | 1 | 10096.055664 | 0.352700 | |||
77 | 1 | 10916.072266 | 0.352700 | |||
78 | 1 | 11738.444336 | 0.352700 | |||
... | ... | ... | ... | ... | ... | ... |
TOTAL | 2 | 53 | 84 | 37 | 10562.357422 | 0.111818 |
57 | 8988.295898 | 0.082410 | ||||
40 | 9857.195312 | 0.111429 | ||||
11 | 8690.839844 | 0.088025 | ||||
23 | 9709.985352 | 0.073750 |
12818 rows × 2 columns
This data set now also includes the adjusted FICA earnings which are marked by “ADJ” as well as the probability of serving in the military conditional on being in a group made up by ethnicity, birth cohort and lottery interval.
Below we have a short look at how the distribution of the different earnings measures look like. In the table you see the real earnings in 1978 dollar terms for the years from 1974 to 1984 for FICA and adjusted FICA as well as the years 1978 until 1984 for Total W-2.
[6]:
for data in ["ADJ", "TAXAB", "TOTAL"]:
ax = sns.kdeplot(
data_cwhsc_new.loc[data, "earnings"],
color=np.random.choice(np.array([sns.color_palette()]).flatten(), 4),
)
ax.set_xlim(xmax=20000)
ax.legend(["Adjusted FICA", "FICA", "TOTAL W-2"], loc="upper left")
ax.set_title("Kernel Density of the different Earning Measures")
[6]:
Text(0.5, 1.0, 'Kernel Density of the different Earning Measures')
For a more detailed description of the somewhat confusing original variable names in the data sets please refer to the appendix at the very bottom of the notebook.
3.2 Establishing the Validity of the Instrument
In order to convincingly pursue the identification strategy outlined above it is necessary to establish an effect of draft eligibility (the draft lottery) on veteran status and to argue that draft eligibility is exogenous to any unobserved factor affecting both veteran status and subsequent earnings. As argued before one could easily construct reasonable patterns of unobservables that both cause veteran status and earnings rendering a naive regression of earnings on veteran status as biased.
The first requirement for IV to be valid holds as it is clearly observable that draft-eligibility has an effect on veteran status. The instrument is hence relevant. For the second part Angrist argues that the draft lottery itself is random in nature and hence not correlated with any unobserved characteristics (exogenous) a man might have that causes him to enroll in the army while at the same time making his earnings likely to go into a certain direction irrespective of veteran status.
On the basis of this, Angrist now shows that subsequent earnings are affected by draft eligibility. This is the foundation to find a nonzero effect of veteran status on earnings. Going back to the causal diagram from before, Angrist argued so far that there is no directed graph from Draft Lottery to the unobservables U but only to Military Service. Now he further establishes the point that there is an effect of draft-eligibility (Draft Lottery) that propagates through Military Service onto earnings (Wages).
In order to see this clearly let us have a look at Figure 1 of the paper below. For white and nonwhite men separately the history of average FICA earnings in 1978 dollar terms is plotted. This is done by year within cohort across those that were draft-eligible and those that were not. The highest two lines represent the 1950 cohort going down to the cohort of men born in 1953. There is a clearly observable pattern among white men in the cohorts from 1950 to 52 which shows persistently lower earnings for those draft-eligible starting the year in which they could be drafted. This cannot be seen for those born in 1953 which is likely due to the fact that nobody was actually drafted in 1973 which would have otherwise been “their” year. For nonwhite men the picture is less clear. It seems that for cohorts 50 to 52 there is slightly higher earnings for those ineligible but this does not seem to be persistent over time. The cohort 1953 again does not present a conclusive image. Observable in all lines, though, is that before the year of conscription risk there is no difference in earnings among the group which is due to the random nature of the draft lottery.
[7]:
# read in the original data sets
data_cwhsa = pd.read_stata("data/cwhsa.dta")
data_cwhsb = pd.read_stata("data/cwhsb.dta")
data_cwhsc_new = pd.read_stata("data/cwhsc_new.dta")
data_dmdc = pd.read_stata("data/dmdcdat.dta")
data_sipp = pd.read_stata("data/sipp2.dta")
[8]:
get_figure1(data_cwhsa, data_cwhsb)
A more condensed view of the results in Figure 1 is given in Figure 2. It depicts the differences in earnings between the red and the black line in Figure 1 by cohort and ethnicity. This is just included for completeness as it does not provide any further insight in comparison to Figure 1.
[9]:
get_figure2(data_cwhsa, data_cwhsb)
A further continuation of this line of argument is resulting in Table 1. Angrist makes the observations from the figures before even further fine-grained and explicit. In Table 1 Angrist estimates the expected difference in average FICA and Total W-2 earnings by ethnicity within cohort and year of earnings. In the table below for white men we can observe that there is no significant difference to the five percent level for the years before the year in which they might be drafted. This changes for the cohorts from 1950 to 52 in the years 1970 to 72, respectively. There we can observe a significantly lower income for those eligible in comparison to those ineligible. This seems to be persistent for the cohorts 1950 and 52 while less so for those born in 1951 and 1953. It should further be noted that Angrist reports that the quality of the Total W-2 earnings data was low in the first years (it was launched in 1972) explaining the inconlusive estimations in the periods at the beginning.
To focus the attention on the crucial points I mark all the negative estimates in different shades of green with more negative ones being darker. This clearly emphasizes the verbal arguments brought up before.
[10]:
table1 = get_table1(data_cwhsa, data_cwhsb)
table1["white"].style.applymap(background_negative_green)
[10]:
type | FICA | TOTAL | |||||||
---|---|---|---|---|---|---|---|---|---|
byr | 50 | 51 | 52 | 53 | 50 | 51 | 52 | 53 | |
year | Statistic | ||||||||
66 | Average | -21.810000 | |||||||
Standard Error | 14.990000 | ||||||||
67 | Average | -8.020000 | 13.170000 | ||||||
Standard Error | 18.210000 | 16.450000 | |||||||
68 | Average | -14.900000 | 12.340000 | -8.960000 | |||||
Standard Error | 24.200000 | 19.500000 | 19.250000 | ||||||
69 | Average | -2.100000 | 18.790000 | 11.420000 | -4.090000 | ||||
Standard Error | 34.580000 | 26.470000 | 22.780000 | 18.340000 | |||||
70 | Average | -233.870000 | -44.830000 | -5.070000 | 32.940000 | ||||
Standard Error | 39.720000 | 36.700000 | 29.380000 | 24.200000 | |||||
71 | Average | -325.950000 | -298.210000 | -29.420000 | 27.680000 | ||||
Standard Error | 46.630000 | 41.780000 | 40.260000 | 30.350000 | |||||
72 | Average | -203.580000 | -197.450000 | -261.600000 | 2.130000 | ||||
Standard Error | 55.420000 | 51.180000 | 46.890000 | 42.920000 | |||||
73 | Average | -226.650000 | -228.860000 | -357.780000 | -56.580000 | ||||
Standard Error | 67.840000 | 61.640000 | 56.260000 | 54.810000 | |||||
74 | Average | -243.040000 | -155.460000 | -402.740000 | -15.060000 | ||||
Standard Error | 81.450000 | 75.330000 | 68.380000 | 68.150000 | |||||
75 | Average | -295.240000 | -99.210000 | -304.590000 | -28.300000 | ||||
Standard Error | 94.420000 | 89.790000 | 85.010000 | 79.630000 | |||||
76 | Average | -314.220000 | -86.870000 | -370.780000 | -145.510000 | ||||
Standard Error | 106.620000 | 102.940000 | 98.300000 | 93.080000 | |||||
77 | Average | -262.640000 | -274.230000 | -396.970000 | -85.510000 | ||||
Standard Error | 117.910000 | 112.260000 | 111.180000 | 107.140000 | |||||
78 | Average | -205.400000 | -203.880000 | -467.100000 | -65.320000 | 1059.400000 | 233.270000 | 175.360000 | -1974.550000 |
Standard Error | 132.710000 | 127.040000 | 127.300000 | 123.190000 | 2159.340000 | 1609.440000 | 1567.940000 | 912.110000 | |
79 | Average | -263.610000 | -60.530000 | -236.900000 | 89.280000 | -1588.720000 | 523.690000 | -580.860000 | -557.940000 |
Standard Error | 160.590000 | 152.390000 | 153.920000 | 148.700000 | 1575.610000 | 1590.540000 | 736.750000 | 750.140000 | |
80 | Average | -339.160000 | -267.980000 | -312.110000 | -93.880000 | -1028.120000 | 85.630000 | -581.320000 | -428.730000 |
Standard Error | 183.250000 | 175.310000 | 178.230000 | 170.740000 | 756.860000 | 599.870000 | 309.170000 | 341.540000 | |
81 | Average | -435.830000 | -358.320000 | -342.890000 | 34.390000 | -589.670000 | -71.610000 | -440.530000 | -109.540000 |
Standard Error | 210.590000 | 203.670000 | 206.880000 | 199.070000 | 299.430000 | 423.400000 | 265.080000 | 245.250000 | |
82 | Average | -320.200000 | -117.310000 | -235.120000 | 29.490000 | -305.540000 | -72.760000 | -514.710000 | 18.720000 |
Standard Error | 235.860000 | 229.140000 | 232.380000 | 222.660000 | 345.490000 | 372.160000 | 296.570000 | 281.910000 | |
83 | Average | -349.580000 | -314.060000 | -437.740000 | -96.370000 | -512.940000 | -896.550000 | -915.710000 | 30.160000 |
Standard Error | 261.670000 | 253.270000 | 257.550000 | 248.770000 | 441.220000 | 426.380000 | 395.260000 | 318.120000 | |
84 | Average | -484.390000 | -398.460000 | -436.060000 | -228.680000 | -1143.320000 | -809.200000 | -767.240000 | -164.210000 |
Standard Error | 286.830000 | 279.260000 | 281.930000 | 272.260000 | 492.270000 | 380.960000 | 376.060000 | 366.100000 |
For the nonwhite males there is no clear cut pattern. Only few cells show significant results which is why Angrist in the following focuses on white males when constructing IV estimates. For completeness I present Table 1 for nonwhite males below although it is somewhat less important for the remainder of the paper.
[11]:
table1["nonwhite"].style.applymap(background_negative_green)
[11]:
type | FICA | TOTAL | |||||||
---|---|---|---|---|---|---|---|---|---|
byr | 50 | 51 | 52 | 53 | 50 | 51 | 52 | 53 | |
year | Statistic | ||||||||
66 | Average | -11.880000 | |||||||
Standard Error | 27.690000 | ||||||||
67 | Average | 12.910000 | -4.030000 | ||||||
Standard Error | 34.230000 | 30.660000 | |||||||
68 | Average | -29.540000 | -6.290000 | -12.040000 | |||||
Standard Error | 44.510000 | 37.400000 | 35.050000 | ||||||
69 | Average | -5.130000 | 67.800000 | 3.450000 | -42.420000 | ||||
Standard Error | 66.850000 | 53.410000 | 43.420000 | 36.490000 | |||||
70 | Average | -99.820000 | 62.250000 | 24.750000 | -0.950000 | ||||
Standard Error | 78.600000 | 75.740000 | 62.270000 | 45.000000 | |||||
71 | Average | -164.810000 | -144.310000 | -25.080000 | 18.230000 | ||||
Standard Error | 92.750000 | 86.500000 | 85.190000 | 60.790000 | |||||
72 | Average | -188.880000 | -156.720000 | -208.280000 | 60.440000 | ||||
Standard Error | 113.610000 | 105.730000 | 104.280000 | 92.830000 | |||||
73 | Average | -85.730000 | -134.890000 | -175.680000 | 115.590000 | ||||
Standard Error | 137.790000 | 127.080000 | 129.090000 | 119.480000 | |||||
74 | Average | -179.350000 | -96.710000 | -181.420000 | 216.590000 | ||||
Standard Error | 165.090000 | 160.130000 | 155.650000 | 145.200000 | |||||
75 | Average | -190.350000 | -236.150000 | -183.730000 | 111.640000 | ||||
Standard Error | 189.320000 | 186.810000 | 185.880000 | 166.950000 | |||||
76 | Average | -105.340000 | -333.790000 | -308.910000 | -46.400000 | ||||
Standard Error | 214.710000 | 215.410000 | 216.540000 | 199.360000 | |||||
77 | Average | 112.430000 | -206.880000 | -251.130000 | 153.510000 | ||||
Standard Error | 238.500000 | 240.490000 | 248.540000 | 233.510000 | |||||
78 | Average | 163.670000 | -108.610000 | -424.930000 | 381.910000 | -1145.070000 | 2978.240000 | -4676.250000 | -482.800000 |
Standard Error | 272.670000 | 269.280000 | 279.480000 | 275.770000 | 2395.620000 | 2869.680000 | 1393.130000 | 2206.090000 | |
79 | Average | 187.040000 | -210.310000 | -391.710000 | 312.040000 | 4005.420000 | 1545.070000 | -494.790000 | -1043.330000 |
Standard Error | 317.210000 | 323.080000 | 324.830000 | 326.330000 | 2721.280000 | 2191.150000 | 2683.890000 | 1660.240000 | |
80 | Average | 203.250000 | 4.810000 | -212.660000 | 344.080000 | 790.240000 | 376.470000 | -292.700000 | 288.700000 |
Standard Error | 363.100000 | 368.410000 | 372.530000 | 370.320000 | 648.170000 | 533.690000 | 441.000000 | 416.500000 | |
81 | Average | 534.520000 | 313.200000 | -305.860000 | 717.820000 | 802.590000 | 415.980000 | -272.360000 | 784.410000 |
Standard Error | 413.580000 | 419.190000 | 429.110000 | 433.730000 | 524.630000 | 745.170000 | 492.870000 | 503.150000 | |
82 | Average | 285.160000 | 175.470000 | -262.570000 | 810.470000 | 326.040000 | -244.340000 | -160.220000 | 675.160000 |
Standard Error | 461.290000 | 471.650000 | 476.750000 | 486.300000 | 608.970000 | 647.840000 | 590.010000 | 564.100000 | |
83 | Average | 96.070000 | 419.560000 | -177.340000 | 543.640000 | 315.480000 | 254.330000 | -53.640000 | 462.350000 |
Standard Error | 512.620000 | 538.170000 | 531.510000 | 523.260000 | 720.000000 | 767.600000 | 643.490000 | 638.970000 | |
84 | Average | -76.870000 | -223.190000 | -123.400000 | 641.350000 | -287.440000 | -718.610000 | -288.100000 | 827.400000 |
Standard Error | 548.220000 | 562.880000 | 568.600000 | 568.200000 | 804.100000 | 771.590000 | 721.010000 | 716.810000 |
3.3 Measuring the Effect of Military Service on Earnings
3.3.1 Wald-estimates
As discussed in the identification section a simple OLS regression estimating the model in equation (1) might suffer from bias due to elements of \(s_i\) that are correlated with the error term \(u_{it}\). This problem can be to a certain extent circumvented by the grouping method proposed by Abraham Wald (1940). Grouping the data by the instrument which is draft eligibility status makes it possible to uncover the effect of veteran status on earnings. An unbiased estimate of \(\alpha\) can therefore be found by adjusting the difference in mean earnings across eligibility status by the difference in probability of becoming a veteran conditional on being either draft eligible or not. This verbal explanation is translated in the following formula:
The variable \(\bar y\) captures the mean earnings within a certain cohort and year further defined by the superscript \(e\) or \(n\) which indicates draft-eligibility status. The above formula poses the problem that the conditional probabilities of being a veteran cannot be obtained from the CWHS data set alone. Therefore in Table 2 Angrist attempts to estimate them from two other sources. First from the SIPP which has the problem that it is a quite small sample. And secondly, he matches the CWHS data to the DMDC. Here it is problematic, though, that the amount of people entering the army in 1970 (which is the year when those born 1950 were drafted) is only collected for the second half of the year. This is the reason why Angrist has to go with the estimates from the SIPP for the cohort of 1950 while taking the bigger sample of the matched DMDC/CWHS for the birth years 1951 to 53. The crucial estimates needed for the denominator of equation (3) are presented in the last column of Table 2 below. It can already be seen that the differences in earnings by eligibility that we found in Table 1 will be scaled up quite a bit to obtain the estimates for \(\hat{\alpha}\). We will come back to that in Table 3.
Note: The cohort 1950 for the DMDC/CWHS could not be replicated as the data for cohort 1950 from the DMDC set is missing in the replication data. Above that the standard errors for the estimates coming form SIPP differ slightly from the published results but are equal to the results from the replication code.
[12]:
table2 = get_table2(data_cwhsa, data_dmdc, data_sipp)
table2["white"]
[12]:
Sample | P(Veteran) | P(Veteran|eligible) | P(Veteran|ineligible) | P(V|eligible) - P(V|ineligible) | |||
---|---|---|---|---|---|---|---|
Data Set | Cohort | Statistic | |||||
SIPP (84) | 1950 | Value | 351.0 | 0.2673 | 0.3527 | 0.1934 | 0.1594 |
Standard Error | 0.0136 | 0.0215 | 0.0166 | 0.0272 | |||
1951 | Value | 359.0 | 0.1973 | 0.2831 | 0.1469 | 0.1362 | |
Standard Error | 0.0124 | 0.0230 | 0.0139 | 0.0269 | |||
1952 | Value | 336.0 | 0.1554 | 0.2310 | 0.1257 | 0.1053 | |
Standard Error | 0.0111 | 0.0245 | 0.0119 | 0.0273 | |||
1953 | Value | 390.0 | 0.1298 | 0.2192 | 0.1126 | 0.1066 | |
Standard Error | 0.0102 | 0.0313 | 0.0104 | 0.0330 | |||
DMDC/CWHS | 1951 | Value | 16768.0 | 0.1176 | 0.2071 | 0.0708 | 0.1362 |
Standard Error | 0.0025 | 0.0053 | 0.0024 | 0.0059 | |||
1952 | Value | 17703.0 | 0.1515 | 0.2683 | 0.1102 | 0.1581 | |
Standard Error | 0.0027 | 0.0065 | 0.0027 | 0.0071 | |||
1953 | Value | 17749.0 | 0.1343 | 0.1548 | 0.1268 | 0.0280 | |
Standard Error | 0.0026 | 0.0053 | 0.0029 | 0.0060 |
[13]:
table2["nonwhite"]
[13]:
Sample | P(Veteran) | P(Veteran|eligible) | P(Veteran|ineligible) | P(V|eligible) - P(V|ineligible) | |||
---|---|---|---|---|---|---|---|
Data Set | Cohort | Statistic | |||||
SIPP (84) | 1950 | Value | 70.0 | 0.1625 | 0.1957 | 0.1355 | 0.0603 |
Standard Error | 0.0281 | 0.0449 | 0.0353 | 0.0571 | |||
1951 | Value | 63.0 | 0.1703 | 0.2014 | 0.1514 | 0.0500 | |
Standard Error | 0.0283 | 0.0497 | 0.0340 | 0.0603 | |||
1952 | Value | 52.0 | 0.1332 | 0.1449 | 0.1288 | 0.0161 | |
Standard Error | 0.0265 | 0.0525 | 0.0308 | 0.0609 | |||
1953 | Value | 55.0 | 0.1749 | 0.2247 | 0.1642 | 0.0605 | |
Standard Error | 0.0297 | 0.0762 | 0.0321 | 0.0827 | |||
DMDC/CWHS | 1951 | Value | 5258.0 | 0.0794 | 0.1173 | 0.0599 | 0.0574 |
Standard Error | 0.0037 | 0.0076 | 0.0040 | 0.0086 | |||
1952 | Value | 5493.0 | 0.0953 | 0.1439 | 0.0794 | 0.0644 | |
Standard Error | 0.0040 | 0.0095 | 0.0042 | 0.0104 | |||
1953 | Value | 5303.0 | 0.0925 | 0.0984 | 0.0904 | 0.0080 | |
Standard Error | 0.0040 | 0.0079 | 0.0046 | 0.0092 |
In the next step Angrist brings together the insights gained so far from his analysis. Table 3 presents again differences in mean earnings across eligibility status for different earnings measures and within cohort and year. The values in column 1 and 3 are directly taken from Table 1. In column 2 we now encounter the adjusted FICA measure for the first time. As a reminder, it consists of the scaled up FICA earnings as FICA earnings are only reported to a certain maximum amount. The true average earnings are likely to be higher and Angrist transformed the data to account for this. We can see that the difference in mean earnings is most often in between the one of pure FICA earnings and Total W-2 compensation. In column three there is again the probability difference from the last column of Table 2. As mentioned before the measure is taken from the SIPP sample for the cohort of 1950 and the DMDC/CWHS sample for the other cohorts. Angrist decides to exclude cohort 1953 and nonwhite males as for those draft eligibility does not seem to be an efficient instrument (see Table 1 and Figure 1 and 2). Although Angrist does not, in this replication I also present Table 3 for nonwhites to give the reader a broader picture. Further Angrist focuses his derivations only on the years 1981 to 1984 as those are the latest after the Vietnam war for which there was data avalaible. Effects in those years are most likely to represent long term effects.
Let us now look at the most crucial column of Table 3 which is the last one. It captures the Wald estimate for the effect of veteran status on adjusted FICA earnings in 1978 dollar terms per year and cohort from equation (3). So this is our \(\hat{\alpha}\) per year and cohort. For white males the point estimates indicate that the annual loss in real earnings due to serving in the military was around 2000 dollars. Looking at the high standard errors, though, only few of the estimates are actually statistically significant. In order to see this more clearly I added a star to those values in the last column that are statistically significant to the five percent level.
Note: In the last column I obtain slightly different standard errors than in the paper. The same is the case, though, in the replication code my replication is building up on.
[14]:
table3 = get_table3(data_cwhsa, data_cwhsb, data_dmdc, data_sipp, data_cwhsc_new)
p_value_star(table3["white"], slice(None), ("", "Service Effect in 1978 $"))
[14]:
First Level | Draft Eligibility Effects in Current $ | ||||||
---|---|---|---|---|---|---|---|
Second Level | FICA Earnings | Adjusted FICA Earnings | Total W-2 Earnings | P(V|eligible) - P(V|ineligible) | Service Effect in 1978 $ | ||
Cohort | Year | Statistic | |||||
1950 | 1981 | Value | -435.8 | -487.8 | -589.7 | 0.159 | -2195.3* |
Standard Error | 210.6 | 237.6 | 299.4 | 0.027 | 1069.5 | ||
1982 | Value | -320.2 | -396.1 | -305.5 | -1679.0 | ||
Standard Error | 235.9 | 281.7 | 345.5 | 1194.1 | |||
1983 | Value | -349.6 | -450.1 | -512.9 | -1849.3 | ||
Standard Error | 261.7 | 302.0 | 441.2 | 1240.7 | |||
1984 | Value | -484.4 | -638.8 | -1143.3 | -2517.1 | ||
Standard Error | 286.8 | 336.6 | 492.3 | 1326.3 | |||
1951 | 1981 | Value | -358.3 | -428.8 | -71.6 | 0.136 | -2258.3* |
Standard Error | 203.7 | 216.7 | 423.4 | 0.027 | 1141.2 | ||
1982 | Value | -117.3 | -278.6 | -72.8 | -1382.1 | ||
Standard Error | 229.1 | 251.5 | 372.2 | 1247.5 | |||
1983 | Value | -314.1 | -452.2 | -896.6 | -2174.4 | ||
Standard Error | 253.3 | 277.7 | 426.4 | 1335.3 | |||
1984 | Value | -398.5 | -573.4 | -809.2 | -2644.3 | ||
Standard Error | 279.3 | 308.0 | 381.0 | 1420.3 | |||
1952 | 1981 | Value | -342.9 | -392.7 | -440.5 | 0.105 | -2675.1 |
Standard Error | 206.9 | 220.3 | 265.1 | 0.027 | 1500.6 | ||
1982 | Value | -235.1 | -255.3 | -514.7 | -1638.2 | ||
Standard Error | 232.4 | 254.0 | 296.6 | 1630.1 | |||
1983 | Value | -437.7 | -500.1 | -915.7 | -3110.0 | ||
Standard Error | 257.6 | 283.3 | 395.3 | 1761.9 | |||
1984 | Value | -436.1 | -560.1 | -767.2 | -3340.9 | ||
Standard Error | 281.9 | 310.8 | 376.1 | 1853.8 |
Looking at nonwhite males now, we observe what we already expected. All of the Wald estimates are actually far away from being statistically significant.
[15]:
p_value_star(table3["nonwhite"], slice(None), ("", "Service Effect in 1978 $"))
[15]:
First Level | Draft Eligibility Effects in Current $ | ||||||
---|---|---|---|---|---|---|---|
Second Level | FICA Earnings | Adjusted FICA Earnings | Total W-2 Earnings | P(V|eligible) - P(V|ineligible) | Service Effect in 1978 $ | ||
Cohort | Year | Statistic | |||||
1950 | 1981 | Value | 534.5 | 654.0 | 802.6 | 0.06 | 7780.5 |
Standard Error | 413.6 | 495.2 | 524.6 | 0.057 | 5891.3 | ||
1982 | Value | 285.2 | 335.4 | 326.0 | 3758.5 | ||
Standard Error | 461.3 | 529.8 | 609.0 | 5937.0 | |||
1983 | Value | 96.1 | 169.1 | 315.5 | 1836.3 | ||
Standard Error | 512.6 | 551.6 | 720.0 | 5990.4 | |||
1984 | Value | -76.9 | -65.1 | -287.4 | -677.8 | ||
Standard Error | 548.2 | 601.9 | 804.1 | 6269.8 | |||
1951 | 1981 | Value | 313.2 | 401.5 | 416.0 | 0.05 | 5760.5 |
Standard Error | 419.2 | 446.6 | 745.2 | 0.06 | 6407.4 | ||
1982 | Value | 175.5 | 228.1 | -244.3 | 3081.9 | ||
Standard Error | 471.6 | 524.4 | 647.8 | 7087.0 | |||
1983 | Value | 419.6 | 398.9 | 254.3 | 5224.8 | ||
Standard Error | 538.2 | 558.8 | 767.6 | 7318.6 | |||
1984 | Value | -223.2 | -293.5 | -718.6 | -3687.0 | ||
Standard Error | 562.9 | 598.1 | 771.6 | 7513.4 | |||
1952 | 1981 | Value | -305.9 | -316.5 | -272.4 | 0.016 | -14104.0 |
Standard Error | 429.1 | 454.8 | 492.9 | 0.061 | 20262.8 | ||
1982 | Value | -262.6 | -502.6 | -160.2 | -21092.7 | ||
Standard Error | 476.8 | 524.1 | 590.0 | 21993.9 | |||
1983 | Value | -177.3 | -275.9 | -53.6 | -11221.1 | ||
Standard Error | 531.5 | 546.6 | 643.5 | 22235.2 | |||
1984 | Value | -123.4 | -99.8 | -288.1 | -3892.0 | ||
Standard Error | 568.6 | 600.3 | 721.0 | 23420.2 |
3.3.2 More complex IV estimates
In the next step Angrist uses a more generalized version of the Wald estimate for the given data. While in the previous analysis the mean earnings were compared solely on the basis of two groups (eligibles and ineligibles, which were determined by the lottery numbers), in the following this is extended to more complex subgroups. The grouping is now based on intervals of five consecutive lottery numbers. As explained in the section on idenficication this boils down to estimating the model described in equation (2).
\(\bar y_{ctj}\) captures the mean earnings by cohort \(c\), in year \(t\) for group \(j\). \(\hat p_{cj}\) depicts the estimated probability of being a veteran conditional on being in cohort \(c\) and group \(j\). We are now interested in obtaining an estimate of \(\alpha\). In our current set up \(\alpha\) corresponds to a linear combination of the many different possible Wald estimates when comparing each of the subgroups in pairs. With this view in mind Angrist restricts the treatment effect to be same (i.e. equal to \(\alpha\)) for each comparison of subgroups. The above equation is equivalent to the second stage of the 2SLS estimation. Angrist estimates the above model using the mean real earnings averaged over the years 1981 to 84 and the cohorts from 1950 to 53. In the first stage Angrist has to estimate \(\hat p_{cj}\) which is done again by using a combination of the SIPP sample and the matched DWDC/CWHS data set. With this at hand Angrist shows how the equation (2) looks like if it was estimated by OLS. The following Figure 3 is also called Visual Instrumental Variables (VIV). In order to arrive there he takes the residuals from an OLS regression of \(\bar y_{ctj}\) and \(\hat p_{cj}\) on cohort and time dummies, respectively. Then he performs another OLS regression of the earnings residuals on the probability residuals. This is depicted in Figure 3 below. The slope of the regression line corresponds to an IV estimate of \(\alpha\). The slope amounts to an estimate of -2384 dollars which serves as a reference for the treatment effect measured by another, more efficient method described below the Figure.
[16]:
get_figure3(data_cwhsc_new)
We now shortly turn back to a remark from before. Angrist is forced to only work with sample means due to confidentiality restrictions on the underlying micro data. For the Wald estimates it is somewhat easily imaginable that this does not pose any problem. For the above estimation of \(\alpha\) using 2SLS this is less obvious. Angrist argues, though, that there is a Generalized Method of Moments (GMM) interpretation of the 2SLS approach which allows him to work with sample moments alone. Another important implication thereof is that he is not restricted to using only one sample to obtain the sample moments. In our concrete case here, it is therefore unproblematic that the earnings data is coming from another sample than the conditional probabilities of being a veteran as both of the samples are drawn from the same population. This is a characteristic of the GMM.
In the following, Angrist estimates equation (2) by using the more efficient approach of Generalized Least Squares (GLS) as opposed to OLS. The GLS is more efficient if there is correlation between the residuals in a regression model. Angrist argues that this is the case in the above model equation and that this correlation can be estimated. GLS works such that coming from the estimated covariance matrix \(\hat\Omega\) of the residuals, the regressors as well as the dependent variable are transformed using the upper triangle of the Cholesky decomposition of \(\hat\Omega^{-1}\). Those transformed variables are then used to run a regular OLS model with nonrobust standard errors. The resulting estimate \(\hat\alpha\) then is the most efficient one (if it is true that there is correlation between the residuals).
Angrist states that the optimal weigthing matrix \(\Omega\) resulting in the most efficient estimate of \(\hat\alpha\) looks the following:
All of the three elements on the right hand side can be estimated from the data at hand.
Now we have all the ingredients to have a look at the results in Table 4. In practice, Angrist estimates two models in the above manner based on the general form of the above regression equation. Model 1 allows the treatment effect to vary by cohort while Model 2 collapses them into a scalar estimate of \(\alpha\). The results for white men in Model 1 show that for each of the three earnings measures as dependent variable only few are statistically significant to the five percent level (indicated by a star added by me again). A look at Model 2 reveals, though, that the combined treatment effect is significant and it amounts to a minus of 2000 dollar (we look again at real earnings in 1978 dollar terms) annualy for those having served in the army. For cohort 1953 we obtain insignificant estimates which was to be expected given that actually nobody was drafted in that year.
Note: The results are again a bit different to those in the paper. The same is the case, though, in the replication code my replication is building up on.
[17]:
table4 = get_table4(data_cwhsc_new)
p_value_star(
table4["white"], (slice(None), slice(None), ["Value", "Standard Error"]), (slice(None)),
)
[17]:
FICA Taxable Earnings | Adjusted FICA Earnings | Total W-2 Compensation | |||
---|---|---|---|---|---|
Model | Cohort | Statistic | |||
Model 1 | 1950 | Value | -1709.2 | -2093.7 | -1895.0 |
Standard Error | 946.8 | 1109.2 | 1336.9 | ||
1951 | Value | -1457.1 | -1983.7 | -2431.4 | |
Standard Error | 954.7 | 1036.5 | 1155.4 | ||
1952 | Value | -1724.0 | -1943.0* | -2058.7* | |
Standard Error | 863.3 | 927.5 | 1004.8 | ||
1953 | Value | 1223.8 | 900.7 | -488.6 | |
Standard Error | 3232.5* | 3506.6* | 3947.4* | ||
Chi Squared | 578.3 | 630.3 | 569.5 | ||
Model 2 | 1950-53 | Value | -1562.9 | -1920.4 | -2094.5 |
Standard Error | 521.7 | 576.8 | 649.1 | ||
Chi Squared | 579.1 | 631.0 | 569.7 |
Angrist also reports those estimates for nonwhite men which are not significant. This was already expected as the the instrument was not clearly correlated with the endogenous variable of veteran status.
[18]:
p_value_star(
table4["nonwhite"], (slice(None), slice(None), ["Value", "Standard Error"]), (slice(None)),
)
[18]:
FICA Taxable Earnings | Adjusted FICA Earnings | Total W-2 Compensation | |||
---|---|---|---|---|---|
Model | Cohort | Statistic | |||
Model 1 | 1950 | Value | 3893.7* | 3871.9* | 5711.8* |
Standard Error | 5355.1 | 6246.9 | 7206.5 | ||
1951 | Value | -891.3 | -333.4 | 2609.0 | |
Standard Error | 4399.6 | 4667.1 | 4887.1 | ||
1952 | Value | -3182.9 | -3457.7 | -3068.0 | |
Standard Error | 3994.9 | 4194.9 | 4222.7 | ||
1953 | Value | -5928.3 | -8571.5 | -6325.8 | |
Standard Error | 10302.3* | 10652.6* | 11393.0* | ||
Chi Squared | 616.7 | 681.7 | 693.6 | ||
Model 2 | 1950-53 | Value | -643.3 | -999.7 | 367.8 |
Standard Error | 2406.8 | 2602.6 | 2733.8 | ||
Chi Squared | 618.4 | 683.4 | 695.6 |
This table concludes the replication of the core results of the paper. Summing up, Angrist constructed a causal graph for which he employs a plausible estimation strategy. Using his approach he concludes with the main result of having found a negative effect of serving in the military during the Vietnam era on subsequent earnings for white male in the United States.
Angrist provides some interpretation of the found effect and some concerns that might arise when reading his paper. I will discuss some of his points in the following critical assessment.
4. Critical Assessment
Considering the time back then and the consequently different state of research, the paper was a major contribution to instrumental variable estimation of treatment effects. More broadly, the paper is very conclusive and well written. Angrist discusses caveats quite thoroughly which makes the whole argumentation at first glance very concise. Methodologically, the paper is quite complex as due to the kind of data available. Angrist is quite innovative in that regard as he comes up with the two sample IV method in this paper which allows him to pratically follow his identification strategy. The attempt to explain the mechanisms behind the negative treatment effect found by him makes the paper comprehensive and shows the great sense of detail Angrist put into this paper.
While keeping in mind the positive sides of his paper, in hindsight, Angrist is a bit too vocal about the relevance and accuracy of his findings. Given our knowledge about the local average treatment effect (LATE) we encountered in our lecture, Angrist only identifies the average treatment effect of the compliers (those that enroll for the army if they are draft-eligible but do not if they are not) if there is individual level treatment heterogeneity and if the causal graph from before is accurate. Hence, the interpretation of the results gives only limited policy implications. For the discussion of veteran compensation the group of those who were induced by the lottery to join the military are not crucial. As there is no draft lottery anymore, what we are interested in is how to compensate veterans for their service who “voluntarily” decided to serve in the military. This question cannot be answered by Angrist’s approach given the realistic assumption that there is treatment effect heterogeneity (which also Angrist argues might be warranted).
A related difficulty of interpretation arises because in the second part, Angrist uses an overidentified model. As already discussed before this amounts to a linear combination of the average treatment effects of subgroups. This mixes the LATEs of several subgroups making the policy implications even more blurred as it is not clear what the individual contributions of the different subgroups are. In this example here this might not make a big difference but should be kept in mind when using entirely different instrumental variables to identify the LATE.
In a last step, there are several possible scenarios to argue why the given causal graph might be violated. Angrist himself delivers one of them. After the lottery numbers were drawn, there was some time in between the drawing and the announcement of the draft-eligibility ceiling. This provoked behavioral responses of some individuals with low numbers to volunteer for the army in order to get better terms of service as well as enrolling in university which rendered them ineligible for the army. In our data, it is unobservable to see the fraction of individuals in each group to join university. If there was actually some avoidance behavior for those with low lottery numbers, then the instrument would be questionable as there would be a path from the Draft Lottery to unobservables (University) which affects earnings. At the same time there is also clearly a relation between University and Military Service.
Rosenzweig and Wolpin (2000) provide a causal graph that draws the general interpretability of the results in Angrist (1990) further into question. Let us look at the causal graph below now imagining that there was no directed graph from Draft Lottery to Civilian Experience. Their argument is that Military Service reduces Schooling and Civilian Experience which lowers Wages while affecting Wages directly and increasing them indirectly by reducing Schooling and increasing work experience. Those subtle mechanism are all collapsed into one measure by Angrist which gives an only insufficiently shallow answer to potentially more complex policy questions. Building up on this causal graph, Heckman (1997) challenges the validity of the instrument in general by making the point that there might be a directed graph from Draft Lottery to Civilian Experience. The argument goes as follows: Employers, after learning about their employees’ lottery numbers, decrease the training on the job for those with a high risk of being drafted. If this is actually warranted the instrument Draft Lottery cannot produce unbiased estimates anymore.
Morgan and Winship (2014) add to this that the bias introduced by this is further affected by how strongly Draft Lottery affects Military Service. Given the factor that the lottery alone does not determine military service but that there are tests, might cause the instrument to be rather weak and therefore a potential bias to be rather strong.
5. Extensions
5.1 Treatment effect with different years of earning
In the calculation of the average treatment in Table 4 Angrist chooses to calculate it for earnings in the years from 1981 to 84. While he plausibly argues that this most likely constitutes a long term effect (as those are the last years for which he has data) in comparison to earlier years, it does not give a complete picture. Looking at Table 1 again we can see that for the earnings differences in the years 81 to 84 quite big estimates are calculated. Assuming that the difference in probability of serving given eligibility versus noneligibility stays somewhat stable across the years, we would expect some heterogeneity in average treatment effects depending on which years we use the earnings data of. Angrist, though, does not investigate this although he has the data for it at hand. For example from a policy perspective one could easily argue that a look at the average treatment effect for earlier years (close to the years in which treatment happens) might be more relevant than the one for years after. This is because given the long time between the actual service and the earnings data of 1981 to 84 it is likely that second round effects are driving some of the results. These might be initially caused by veteran status but for later years the effect of veteran status might mainly act by means of other variables. For instance veterans after the war might be forced to take simple jobs due to their lack of work experience and from then on their path is determined by the lower quality of the job that they had to take right after war. For policy makers it might be of interest to see what happens to veterans right after service to see what needs to be done in order to stop second round effects from happening in the first place.
To give a more wholesome image, I estimate the results for Table 4 for different years of earnings of white men. As mentioned before the quality of the Total W-2 data set is rather low and the adjusted FICA is more plausible than the FICA data. This is why I only use the adjusted FICA data in the following. For the adjusted FICA I have data for Table 4 for the years from 1974 to 1984. For each possible four year range within those ten years I estimate Model 1 and 2 from Table 4 again.
Below I plot the average treatment effects obtained. On the x-axis I present the starting year of the range of the adjusted FICA data used. For starting value 74 it means that the average treatment effect is calculated for earnings data of the years 1974 to 77. The results at the starting year 81 are equivalent to the ones found by Angrist in Table 4 for white men.
[19]:
# get the average treatment effects of Model 1 and 2 with adjusted FICA earnings for
# several different ranges of four years
results_model1 = np.empty((8, 4))
results_model2 = np.array([])
for number, start_year in enumerate(np.arange(74, 82)):
years = np.arange(start_year, start_year + 4)
flex_table4 = get_flexible_table4(data_cwhsc_new, years, ["ADJ"], [50, 51, 52, 53])
results_model1[number, :] = (
flex_table4["white"].loc[("Model 1", slice(None), "Value"), :].values.flatten()
)
results_model2 = np.append(
results_model2, flex_table4["white"].loc[("Model 2", slice(None), "Value"), :].values,
)
[20]:
# Plot the effects for white men in Model 1 and 2
# (colors apart from Cohort 1950 are random, execute again to
# change them)
get_figure1_extension1(results_model1, results_model2)
The pattern is more complex than what we can see in the glimpse of Table 4 in the paper. We can see that there is quite some heterogeneity in average treatment effects across cohorts when looking at the data for early years. This changes when using data of later years. Further the fact of being a veteran does seem to play a role for the cohort 1953 right after the war but the treatment effect becomes insignificant when looking at later years. This is interesting as the cohort of 1953 was the one for which no one was drafted (remember that in 1973 no one was drafted as the last call was in December 1972).
Another observation is linked to the fact that draft eligibility does not matter for those born in 1953. These people appear to have voluntarily joined the army as no one of them could have possibly been drafted. This cannot be said for the cohorts before. Employers can only observe whether a person is a veteran and when they are born (and not if they are compliers or not). A theory could be that employers act on the loss of experience for initial wage setting for every army veteran right after the war. The fact that the cohort of 1953 could only be volunteers but not draftees could give them a boost in social status to catch up again in the long run, though. This mechanism might explain to a certain extent why we observe the upward sloping line for the cohort of 1953 (but not for the other groups).
As discussed in the critical assessment, we actually only capture the local average treatment effect of the compliers. Those are the ones who join the army when they are draft-eligible but do not when they are not. The identifying assumption for the LATE requires that everyone is a complier. This is probably not warranted for the cohort of 1953. In that year it is easily imaginable that there are both defiers and compliers which means that we do not capture the LATE for cohort 1953 in Model 1 and for cohort 1950-53 in Model 2 but something else we do not really know how to interpret. This might be another reason why we observe this peculiar pattern for the cohort of 1953. Following up on this remark I estimate the Model 2 again excluding the cohort of 1953 to focus on the cohorts for which the assumptions for LATE are likely to hold.
[21]:
results_model2_53 = np.array([])
for number, start_year in enumerate(np.arange(74, 82)):
years = np.arange(start_year, start_year + 4)
flex_table4 = get_flexible_table4(data_cwhsc_new, years, ["ADJ"], [50, 51, 52])
results_model2_53 = np.append(
results_model2_53, flex_table4["white"].loc[("Model 2", slice(None), "Value"), :].values,
)
[22]:
get_figure2_extension1(results_model2, results_model2_53)
We can see that for later years the treatment effect is a bit lower when excluding the cohort of 1953. It confirms the findings of Angrist with the advantage of making it possible to attach a clearer interpretation to it.
Following the above path, it would also be interesting to vary the amount of instruments used by more than just the two ways Angrist has shown. It would be interesting to break down the interval size of lottery numbers further. Unfortunately I could no find a way to do that with the already pre-processed data I have at hand.
5.2 Bias Quantification
In the critical assessment I argued that the simple Wald estimate might be biased because employers know their employees’ birth date and hence their draft eligibility. The argument was that employers invest less into the human capital of those that might be drafted. This would cause the instrument of draft eligibility to not be valid and hence suffer from bias. This bias can be calculated in the following way for a binary instrument:
What has been done in the last column of Table 3 (the Wald estimate) is that Angrist calculated the left hand side of this equation. This calculation yields an unbiased estimate of the treatment effect of \(D\) (veteran status) on \(Y\) (earnings) \(\delta\) if there is no effect of the instrument \(Z\) (draft eligibility) on \(Y\) through means of unobservables \(\epsilon\). In our argumentation this assumption does not hold which means that \(E[\epsilon|Z=1] - E[\epsilon|Z=0]\) is not equal to zero as draft eligibility affects \(Y\) by the behavioral change of employers to make investing into human capital dependent on draft eligibility. Therefore the left hand side calculation is not equal to the true treatment effect \(\delta\) but has to be adjusted by the bias \(\frac{E[\epsilon|Z=1] - E[\epsilon|Z=0]}{E[D|Z=1] - E[D|Z=0]}\).
In this section I run a thought experiment in which I quantify this bias. The argumentation here is rather heuristic because I lack the resources to really find a robust estimate of the bias but it gives a rough idea of whether the bias might matter economically. My idea is the following. In order to get a measure of \(E[\epsilon|Z=1] - E[\epsilon|Z=0]\) I have a look at estimates for the effect of work experience on earnings. Remember that the expected difference in earnings due to a difference in draft eligibility is caused by a loss in human capital for those draft eligible because they might miss out on on-the-job-training. This loss in on-the-job-training could be approximated by a general loss in working experience. For an estimate of that effect I rely on Keane and Wolpin (1997) who work with a sample for young men between 14 and 21 years old from the year 1979. The effect of working experience on real earnings could be at least not far off of the possible effect in our sample of adjusted FICA real earnings of 19 year old men for the years 1981 to 1984. Remember that lottery participants find out about whether they are draft eligible or not at the end of the year before they might be drafted. I assume that draft dates are spread evenly over the draft year. One could then argue that on average a draft eligible person stays in his job for another half a year after having found out about the eligibility and before being drafted. Hence, for on average half a year an employer might invest less into the human capital of this draft eligible man. I assume now that employers show a quite moderate behavioral response. During the six months of time, the employees only receive a five month equivalent of human capital gain (or work experience gain) as opposed to the six months they stay in the company. This means they loose one month of work experience on average in comparison to those that are not draft eligible.
To quantify this one month loss of work experience I take estimates from Keane and Wolpin (1997). For blue collar workers they roughly estimate the gain in real earnings in percent from an increase in a year of blue collar work experience to be 4.6 percent (actually their found effect depends on the years of work experience but I simplify this for my rough calculations). For white collar workers the equivalent estimate amounts to roughly 2.7 percent. I now take those as upper and lower bounds, calculate their one month counterparts and quantify the bias in the Wald estimates of the last column of Table 3. The bias \(\frac{E[\epsilon|Z=1] - E[\epsilon|Z=0]}{E[D|Z=1] - E[D|Z=0]}\) is then roughly equal to the loss in annual real earnings due to one month less of work experience divided by the difference in probability of being a veteran conditional on draft eligibility.
The first table below depicts how the bias changes by cohort across the different years of real earnings with increasing estimates of how a loss in experience affects real earnings. Clearly with increasing estimates of how strong work experience contributes to real earnings, the bias gets stronger. This is logical as it is equivalent to an absolute increase in the nominator. Above that the bias is stronger for later years of earnings as the real earnings increase by year. Further the slope is steeper for later cohorts as the denominator is smaller for later cohorts. Given the still moderate assumption of a loss of one month of work experience we can see that the bias does not seem to be negligible economically especially when taking the blue collar percentage estimate.
[23]:
# Calculate the bias, the true delta and the orginal Wald estimate for a
# ceratain interval of working experience effect
interval = np.linspace(0.025, 0.05, 50) / 12
bias, true_delta, wald = get_bias(
data_cwhsa, data_cwhsb, data_dmdc, data_sipp, data_cwhsc_new, interval
)
[24]:
# plot the bias by cohort
get_figure1_extension2(bias, interval)
To get a sense of how the size of the bias relates to the size of the previously estimated Wald coefficients, let us have look at the figure below. It shows for each cell consisting of a cohort and year combination, the Wald estimate from Table 3 as the horizontal line and the true \(\delta\) depending on the weight of the loss in work experience as the upward sloping line. Given that our initial estimates of the Wald coefficients are in a range of only a few thousands, an estimated bias of roughly between 200 and 500 dollars cannot be characterized as incosiderable. Further given Angrist’s policy question concerning Veteran compensation, even an estimate that is higher by 200 dollars makes a big difference when it is about compensating thousands of veterans.
[25]:
# plot the the true delta (accounted for the bias) compared to the original Wald estimate
get_figure2_extension2(true_delta, wald, interval)
6. Conclusion
Regarding the overall quality and structure of Angrist (1990), reading it is a real treat. The controversy after its publication and the fact that it is highly cited clearly show how important its contribution was and still is. It is a great piece of discussion when it comes to the interpretability and policy relevance of instrumental variable approaches. As already reiterated in the critical assessment, one has to acknowledge the care Angrist put into this work. Although his results do not seem to prove reliable, it opened a whole discussion on how to use instrumental variables to get the most out of them. Another contribution that should not go unnoticed is that Angrist shows that instruments can be used even though they might not come from the same sample as the dependent and the endogenous variable. Practically, this is very useful as it widens possible areas of application for instrumental variables.
Overall, it has to be stated that the paper has some shortcomings but the care put into this paper and the good readibility allowed other researchers (and Angrist himself) to swoop in giving helpful remarks that improved the understanding of instrumental variable approaches for treatment effect evaluation.
References
Angrist, J. (1990). Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records. American Economic Review. 80. 313-36.
Angrist, J. D., & Pischke, J.-S. (2009). Mostly harmless econometrics: An empiricist’s companion.
Heckman, J. (1997). Instrumental Variables: A Study of Implicit Behavioral Assumptions Used in Making Program Evaluations. The Journal of Human Resources, 32(3), 441-462. doi:10.2307/146178
Keane, M., & Wolpin, K. (1997). The Career Decisions of Young Men. Journal of Political Economy, 105(3), 473-522. doi:10.1086/262080
Morgan, S., and Winship, C. (2014). Counterfactuals and Causal Inference: Methods and Principles for Social Research (Analytical Methods for Social Research). Cambridge: Cambridge University Press. doi:10.1017/CBO9781107587991
Rosenzweig, M. R. and Wolpin, K. I.. (2000). “Natural ‘Natural Experiments’ in Economics.” Journal of Economic Literature 38:827–74.
Wald, A. (1940). The Fitting of Straight Lines if Both Variables are Subject to Error. Ann. Math. Statist. 11 , no. 3, 284–300.
Appendix
Key Variables in the Data Sets
data_cwhsa
Name |
Description |
---|---|
index |
|
byr |
birth year |
race |
ethnicity, 1 for white and 2 for nonwhite |
interval |
interval of draft lottery numbers, 73 intervals with the size of five consecutive numbers |
year |
year for which earnings are collected |
variables |
|
vmn1 |
nominal earnings |
vfin1 |
fraction of people with zero earnings |
vnu1 |
sample size |
vsd1 |
standard deviation of earnings |
data_cwhsb
Name |
Description |
---|---|
index |
|
byr |
birth year |
race |
ethnicity, 1 for white and 2 for nonwhite |
interval |
interval of draft lottery numbers, 73 intervals with the size of five consecutive numbers |
year |
year for which earnings are collected |
type |
source of the earnings data, “TAXAB” for FICA and “TOTAL” for Total W-2 |
variables |
|
vmn1 |
nominal earnings |
vfin1 |
fraction of people with zero earnings |
vnu1 |
sample size |
vsd1 |
standard deviation of earnings |
data_cwhsc_new
Name |
Description |
---|---|
index |
|
byr |
birth year |
race |
ethnicity, 1 for white and 2 for nonwhite |
interval |
interval of draft lottery numbers, 73 intervals with the size of five consecutive numbers |
year |
year for which earnings are collected |
type |
source of the earnings data, “ADJ” for adjusted FICA, “TAXAB” for FICA and “TOTAL” for Total W-2 |
variables |
|
earnings |
real earnings in 1978 dollars |
nj |
sample size |
nj0 |
number of persons in the sample with zero earnings |
iweight_old |
weight for weighted least squares |
ps_r |
fraction of people having served in the army |
ern74 to ern84 |
unweighted covariance matrix of the real earnings |
data_dmdc
Name |
Description |
---|---|
index |
|
byr |
birth year |
race |
ethnicity, 1 for white and 2 for nonwhite |
interval |
interval of draft lottery numbers, 73 intervals with the size of five consecutive numbers |
variables |
|
nsrvd |
number of people having served |
ps_r |
fraction of people having served |
data_sipp (this is the only micro data set)
Name |
Description |
---|---|
index |
|
u_brthyr |
birth year |
nrace |
ethnicity, 0 for white and 1 for nonwhite |
variables |
|
nvstat |
0 if man is not a veteran, 1 if he is |
fnlwgt_5 |
fraction of people with this index among overall sample |
rsncode |
1 if person was draft eligible, else if not |