Session 5
PMAP 8521: Program evaluation
Andrew Young School of Policy Studies
do()ing observational
causal inference
Potential outcomes
The relationship between nodes can be described with equations
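$\text{Loc} = f_{\text{Loc}}(\text{U1})$
$\text{Bkgd} = f_{\text{Bkgd}}(\text{U1})$
$\text{JobCx} = f_{\text{JobCx}}(\text{Edu})$
$\text{Edu} = f_{\text{Edu}}(\text{Req}, \text{Loc}, \text{Bkgd}, \text{Year})$
$\text{Earn} = f_{\text{Earn}}(\text{Edu}, \text{Year}, \text{Bkgd}, \text{Loc}, \text{JobCx})$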
dagify() in ggdag forces you to think this way:
library(ggdag)

dagify(
  Earn ~ Edu + Year + Bkgd + Loc + JobCx,
  Edu ~ Req + Loc + Bkgd + Year,
  JobCx ~ Edu,
  Bkgd ~ U1,
  Loc ~ U1
)
All these nodes are related; there's correlation between them all
We care about Edu → Earn, but what do we do about all the other nodes?
A causal effect is identified if the association between treatment and outcome is properly stripped and isolated
Arrows in a DAG transmit associations
You can redirect and control those paths by "adjusting" or "conditioning"
Confounding: common cause
Causation: mediation
Collision: selection / endogeneity
do-operator
Making an intervention in a DAG
$P[Y \mid do(X = x)]$ or $E[Y \mid do(X = x)]$
P = probability distribution, or E = expectation/expected value
Y = outcome, X = treatment; x = specific value of treatment
E[Y | do(X=x)]
E[ Earnings | do(One year of college)]
E[ Firm growth | do(Government R&D funding)]
E[ Air quality | do(Carbon tax)]
E[ Juvenile delinquency | do(Truancy program)]
E[ Malaria infection rate | do(Mosquito net)]
When you do() X, delete all arrows into it
Observational DAG
Experimental DAG
E[Earnings | do(College education)]
Observational DAG
Experimental DAG
We want to know P[Y | do(X)], but all we have is observational data X, Y, and Z
$P[Y \mid do(X)] \neq P(Y \mid X)$
Correlation isn't causation!
Our goal with observational data:
Rewrite P[Y | do(X)] so that it doesn't have a do() anymore (is "do-free")
Do-calculus: a set of three rules that let you manipulate a DAG in special ways to remove do() expressions
WAAAAAY beyond the scope of this class!
Just know it exists and computer algorithms can do it for you!
Backdoor adjustment
Frontdoor adjustment
$P[Y \mid do(X)] = \sum_{Z} P(Y \mid X, Z) \times P(Z)$
↑ That's complicated!
The right-hand side of the equation means "the effect of X on Y after adjusting for Z"
There's no do() on that side!
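To make the adjustment formula concrete, here is a minimal sketch in base R with a single binary confounder Z. Every probability below is a made-up illustrative number (none of them come from the slides), and the variable names are hypothetical. The sketch compares the do-free adjusted quantity with what you would get by naively conditioning on X.

```r
# Hypothetical numbers: Z is a binary confounder that affects both X and Y
p_z       <- c(z0 = 0.5, z1 = 0.5)  # P(Z = z)
p_x1_z    <- c(z0 = 0.2, z1 = 0.8)  # P(X = 1 | Z = z)
p_y1_x1_z <- c(z0 = 0.3, z1 = 0.7)  # P(Y = 1 | X = 1, Z = z)
p_y1_x0_z <- c(z0 = 0.1, z1 = 0.5)  # P(Y = 1 | X = 0, Z = z)

# Backdoor adjustment: sum over Z of P(Y | X, Z) * P(Z)
p_y1_do_x1 <- sum(p_y1_x1_z * p_z)
p_y1_do_x0 <- sum(p_y1_x0_z * p_z)
adjusted_effect <- p_y1_do_x1 - p_y1_do_x0

# Naive conditioning weights each stratum by P(Z | X) instead of P(Z)
p_z_x1 <- p_z * p_x1_z / sum(p_z * p_x1_z)
p_z_x0 <- p_z * (1 - p_x1_z) / sum(p_z * (1 - p_x1_z))
naive_difference <- sum(p_y1_x1_z * p_z_x1) - sum(p_y1_x0_z * p_z_x0)

print(c(adjusted = adjusted_effect, naive = naive_difference))
```

With these invented numbers the adjusted difference is 0.2 and the naive difference is 0.44. The gap appears because Z pushes both treatment and outcome in the same direction.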
Frontdoor adjustment: S → T is d-separated and T → C is d-separated, so combine the two effects to find S → C
If you can transform do() expressions to do-free versions, you can legally make causal inferences from observational data
Backdoor adjustment is easiest to see, and dagitty and ggdag do this for you!
Fancy algorithms (found in the causaleffect package) can do the official do-calculus for you too
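As one possible illustration of letting software find the backdoor adjustment, the sketch below rebuilds the earnings DAG with the dagitty package and asks for minimal adjustment sets for Edu → Earn. It assumes dagitty is installed; the object names earnings_dag and sets are hypothetical, and the edges are copied from the dagify() call earlier in the deck.

```r
library(dagitty)

# Same edges as the dagify() call for the earnings DAG
earnings_dag <- dagitty("dag {
  Edu -> Earn
  Year -> Earn
  Bkgd -> Earn
  Loc -> Earn
  JobCx -> Earn
  Req -> Edu
  Loc -> Edu
  Bkgd -> Edu
  Year -> Edu
  Edu -> JobCx
  U1 -> Bkgd
  U1 -> Loc
}")

# Minimal sets of nodes to adjust for in order to close the backdoor paths
sets <- adjustmentSets(earnings_dag, exposure = "Edu", outcome = "Earn")
print(sets)
```

In this DAG the backdoor paths from Edu to Earn run through Year, Bkgd, and Loc, so those are the nodes to expect in the result. JobCx is a mediator and should not be adjusted for when estimating the total effect.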
Causal effect = δ (delta)
$\delta = P[Y \mid do(X)]$
$\delta = E[Y \mid do(X)] - E[Y \mid \hat{do}(X)]$
$\delta = (Y \mid X = 1) - (Y \mid X = 0)$
$\delta = Y_1 - Y_0$
Fundamental problem of causal inference
$\delta_i = Y_{1i} - Y_{0i}$ in real life is $\delta_i = Y_{1i} - \text{???}$
Individual-level effects are impossible to observe!
There are no individual counterfactuals!
Solution: Use averages instead
$\text{ATE} = E(Y_1 - Y_0) = E(Y_1) - E(Y_0)$
Difference between average/expected value when program is on vs. expected value when program is off
$\delta = (\bar{Y} \mid P = 1) - (\bar{Y} \mid P = 0)$
Person | Age | Treated | Outcome with program | Outcome without program | Effect |
---|---|---|---|---|---|
1 | Old | TRUE | 80 | 60 | 20 |
2 | Old | TRUE | 75 | 70 | 5 |
3 | Old | TRUE | 85 | 80 | 5 |
4 | Old | FALSE | 70 | 60 | 10 |
5 | Young | TRUE | 75 | 70 | 5 |
6 | Young | FALSE | 80 | 80 | 0 |
7 | Young | FALSE | 90 | 100 | -10 |
8 | Young | FALSE | 85 | 80 | 5 |
$\text{ATE} = \frac{20 + 5 + 5 + 5 + 10 + 0 + (-10) + 5}{8} = 5$
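The arithmetic above can be checked with a few lines of base R. This is only a sketch: the data frame po and its column names (y1, y0, effect) are hypothetical labels for the table's "outcome with program", "outcome without program", and "effect" columns.

```r
# Potential outcomes table from the slide (both outcomes known for everyone)
po <- data.frame(
  person  = 1:8,
  age     = rep(c("Old", "Young"), each = 4),
  treated = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  y1      = c(80, 75, 85, 70, 75, 80, 90, 85),   # outcome with program
  y0      = c(60, 70, 80, 60, 70, 80, 100, 80)   # outcome without program
)

# Unit-level effect, then its average across all eight people
po$effect <- po$y1 - po$y0
ate <- mean(po$effect)
print(ate)  # 5
```

This only works because the table pretends we can see both potential outcomes for every person, which is never true in real data.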
ATE in subgroups (conditional average treatment effect, CATE)
Is the program more effective for specific age groups?
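$\delta = (\bar{Y}_{O} \mid P = 1) - (\bar{Y}_{O} \mid P = 0)$
$\delta = (\bar{Y}_{Y} \mid P = 1) - (\bar{Y}_{Y} \mid P = 0)$
$\text{CATE}_{\text{Old}} = \frac{20 + 5 + 5 + 10}{4} = 10$
$\text{CATE}_{\text{Young}} = \frac{5 + 0 - 10 + 5}{4} = 0$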
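A short base R sketch of the same subgroup calculation, using the unit-level effects from the potential outcomes table. The object names (po, cate) are hypothetical.

```r
# Unit-level effects and age groups from the potential outcomes table
po <- data.frame(
  age    = rep(c("Old", "Young"), each = 4),
  effect = c(20, 5, 5, 10, 5, 0, -10, 5)
)

# Average effect within each age group
cate <- tapply(po$effect, po$age, mean)
print(cate)  # Old = 10, Young = 0
```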
Average treatment on the treated
ATT / TOT
Effect for those with treatment
Average treatment on the untreated
ATU / TUT
Effect for those without treatment
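$\delta = (\bar{Y}_{T} \mid P = 1) - (\bar{Y}_{T} \mid P = 0)$
$\delta = (\bar{Y}_{U} \mid P = 1) - (\bar{Y}_{U} \mid P = 0)$
$\text{CATE}_{\text{Treated}} = \frac{20 + 5 + 5 + 5}{4} = 8.75$
$\text{CATE}_{\text{Untreated}} = \frac{10 + 0 - 10 + 5}{4} = 1.25$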
The ATE is the weighted average of the ATT and ATU
$\text{ATE} = (\pi_{\text{Treated}} \times \text{ATT}) + (\pi_{\text{Untreated}} \times \text{ATU})$
$\left(\frac{4}{8} \times 8.75\right) + \left(\frac{4}{8} \times 1.25\right)$
$4.375 + 0.625 = 5$
π here means "proportion," not 3.1415
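The weighted-average identity can be verified in base R with the same table. This is a sketch with hypothetical object names (po, att, atu, pi_treated).

```r
# Unit-level effects and treatment status from the potential outcomes table
po <- data.frame(
  treated = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  effect  = c(20, 5, 5, 10, 5, 0, -10, 5)
)

att <- mean(po$effect[po$treated])    # effect among the treated: 8.75
atu <- mean(po$effect[!po$treated])   # effect among the untreated: 1.25
pi_treated <- mean(po$treated)        # proportion treated: 4/8

# Weighted average of ATT and ATU recovers the ATE
ate <- pi_treated * att + (1 - pi_treated) * atu
print(c(ATT = att, ATU = atu, ATE = ate))
```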
ATE and ATT aren't always the same
ATE = ATT + Selection bias
$5 = 8.75 + x$, so $x = -3.75$
Randomization fixes this, makes x = 0
Person | Age | Treated | Actual outcome |
---|---|---|---|
1 | Old | TRUE | 80 |
2 | Old | TRUE | 75 |
3 | Old | TRUE | 85 |
4 | Old | FALSE | 60 |
5 | Young | TRUE | 75 |
6 | Young | FALSE | 80 |
7 | Young | FALSE | 100 |
8 | Young | FALSE | 80 |
Treatment not randomly assigned
We can't see unit-level causal effects
What do we do?!
Treatment seems to be correlated with age
We can estimate the ATE by finding the weighted average of age-based CATEs
As long as we assume/pretend that treatment was randomly assigned within each age group (unconfoundedness)
$\widehat{\text{ATE}} = \pi_{\text{Old}} \widehat{\text{CATE}}_{\text{Old}} + \pi_{\text{Young}} \widehat{\text{CATE}}_{\text{Young}}$
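$\widehat{\text{CATE}}_{\text{Old}} = \frac{80 + 75 + 85}{3} - \frac{60}{1} = 20$
$\widehat{\text{CATE}}_{\text{Young}} = \frac{75}{1} - \frac{80 + 100 + 80}{3} = -11.667$
$\widehat{\text{ATE}} = \left(\frac{4}{8} \times 20\right) + \left(\frac{4}{8} \times -11.667\right) = 4.1667$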
$\widehat{\text{ATE}} = \widehat{\text{CATE}}_{\text{Treated}} - \widehat{\text{CATE}}_{\text{Untreated}}$
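$\widehat{\text{CATE}}_{\text{Treated}} = \frac{80 + 75 + 85 + 75}{4} = 78.75$
$\widehat{\text{CATE}}_{\text{Untreated}} = \frac{60 + 80 + 100 + 80}{4} = 80$
$\widehat{\text{ATE}} = 78.75 - 80 = -1.25$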
You can only do this if treatment is random!
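The two estimates can be put side by side in base R using only the observed outcomes. This is a sketch with hypothetical object names (observed, naive, cate_hat, ate_hat), and the stratified version leans on the unconfoundedness-within-age assumption stated above.

```r
# Observed data only: one actual outcome per person
observed <- data.frame(
  age     = rep(c("Old", "Young"), each = 4),
  treated = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE),
  outcome = c(80, 75, 85, 60, 75, 80, 100, 80)
)

# Naive estimate: treated mean minus untreated mean, ignoring age
naive <- mean(observed$outcome[observed$treated]) -
  mean(observed$outcome[!observed$treated])

# Stratified estimate: treated-minus-untreated difference within each age
# group, weighted by each group's share of the sample
group_means <- tapply(observed$outcome,
                      list(observed$age, observed$treated), mean)
cate_hat <- group_means[, "TRUE"] - group_means[, "FALSE"]
weights  <- as.numeric(table(observed$age)[names(cate_hat)]) / nrow(observed)
ate_hat  <- sum(cate_hat * weights)

print(c(naive = naive, stratified = ate_hat))
```

The naive difference is -1.25 while the age-weighted estimate is about 4.17, which is much closer to the true ATE of 5 from the potential outcomes table.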
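$\widehat{\text{ATE}} = \pi_{\text{Old}} \widehat{\text{CATE}}_{\text{Old}} + \pi_{\text{Young}} \widehat{\text{CATE}}_{\text{Young}}$
We used age here because it correlates with (and confounds) the outcome
And we assumed unconfoundedness: that treatment is randomly assigned within the groups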
Does attending a private university cause an increase in earnings?
This is tempting!
Average private − Average public
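$\frac{110 + 100 + 60 + 115 + 75}{5} = 92$
$\frac{110 + 30 + 90 + 60}{4} = 72.5$
$\left(92 \times \frac{5}{9}\right) - \left(72.5 \times \frac{4}{9}\right) = 18.888$ (earnings in thousands of dollars, so about 18,888 dollars)
This is wrong!
$\widehat{\text{ATE}} = \pi_{\text{Private}} \widehat{\text{CATE}}_{\text{Private}} + \pi_{\text{Public}} \widehat{\text{CATE}}_{\text{Public}}$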
These groups look like they have similar characteristics
Unconfoundedness?
CATE Group A + CATE Group B
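$\frac{110 + 100}{2} - 110 = -5$ (that is, −5,000 dollars)
$60 - 30 = 30$ (that is, 30,000 dollars)
$\left(-5 \times \frac{3}{5}\right) + \left(30 \times \frac{2}{5}\right) = 9$ (that is, 9,000 dollars)
This is less wrong!
$\widehat{\text{ATE}} = \pi_{\text{Group A}} \widehat{\text{CATE}}_{\text{Group A}} + \pi_{\text{Group B}} \widehat{\text{CATE}}_{\text{Group B}}$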
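The group-weighted calculation can be sketched in base R. The five rows below are reconstructed from the arithmetic on the slide (two private and one public student in Group A, one of each in Group B), so treat the data frame schools and its column names as an assumption rather than the original dataset.

```r
# Earnings in thousands of dollars, reconstructed from the slide's arithmetic
schools <- data.frame(
  group    = c("A", "A", "A", "B", "B"),
  private  = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  earnings = c(110, 100, 110, 60, 30)
)

# Private-minus-public difference within each group
group_means <- tapply(schools$earnings,
                      list(schools$group, schools$private), mean)
cate_hat <- group_means[, "TRUE"] - group_means[, "FALSE"]  # A: -5, B: 30

# Weight each group's difference by its share of the five students
weights <- as.numeric(table(schools$group)[names(cate_hat)]) / nrow(schools)
ate_hat <- sum(cate_hat * weights)
print(ate_hat)  # 9, i.e. 9,000 dollars
```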
$\text{Earnings} = \alpha + \beta_1 \text{Private} + \beta_2 \text{Group} + \epsilon$
model_earnings <- lm(earnings ~ private + group_A, data = schools_small)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 40000 | 11952.29 | 3.35 | 0.08 |
privateTRUE | 10000 | 13093.07 | 0.76 | 0.52 |
group_ATRUE | 60000 | 13093.07 | 4.58 | 0.04 |
β1 = $10,000
This is less wrong!
Significance details!
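For readers who want to rerun the regression, here is a sketch that rebuilds a five-row schools_small data frame from the group arithmetic earlier in the deck. The rows are an assumption (the original data file is not shown on the slides), though they do reproduce the coefficient estimates in the table above.

```r
# Assumed reconstruction of schools_small, with earnings in dollars
schools_small <- data.frame(
  earnings = c(110000, 100000, 110000, 60000, 30000),
  private  = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  group_A  = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

# Regress earnings on private attendance while adjusting for group
model_earnings <- lm(earnings ~ private + group_A, data = schools_small)

# Coefficient table: estimate, standard error, t statistic, p-value
print(round(coef(summary(model_earnings)), 2))
```

The coefficient on privateTRUE is the estimated private-school premium after holding the group constant. With only five observations and two residual degrees of freedom, the standard errors are very wide.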