- We have a treatment D, which is binary (0 or 1)
- We want to know the effect of D -> Y in this (simplified) diagram:

[diagram]

- Matching on W is one way of controlling for W
- We take the treated observations (D=1), look at their Ws, and pick non-treated observations with similar (or identical) values of W

Every matching estimator follows the same basic concept:

1. Pick a set of variables W1, W2, etc., to match on
…
4. Compare the average treated Y vs. the average untreated Y, counting untreated obs more heavily the closer they are

Many, many, many ways to do 3 and 4. Here’s one…
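With a matching variable W, what you do is:

1. Pick a distance a. The smaller it is, the closer the match, but the smaller your eventual sample is
2. For each treated observation i, find all untreated observations for which their W is within a of W[i] (e.g. if a=.1 and the treated observation has W = 2, find the untreated observations with W >= 1.9 & W <= 2.1)
3. Compare Y across treatment

To match exactly on continuous variables, coarsen (cut()) them first

library(dplyr) # %>%, mutate(), filter(), group_by(), summarize(), inner_join()
library(Ecdat)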
data(Wages)
#Coarsen education and experience into 3 bins each
Wages <- Wages %>% mutate(ed.coarse = cut(ed,breaks=3),
                          exp.coarse = cut(exp,breaks=3))
#Split up the treated and untreated
union <- Wages %>% filter(union=='yes')
nonunion <- Wages %>% filter(union=='no') %>%
  #For every potential complete-match, let's get the average Y
  group_by(ed.coarse,exp.coarse,bluecol,
           ind,south,smsa,married,sex,black) %>%
  summarize(untreated.lwage = mean(lwage))

- A join, aka merging, is how you can link up two data sets when they match on a list of variables, i.e. “exact matches”!
- There are several kinds of join (see help(join)). The one we want is inner_join(), which only keeps successful matches, both treated and untreated

union %>% inner_join(nonunion) %>%
  summarize(union.mean = mean(lwage),nonunion.mean=mean(untreated.lwage))
## union.mean nonunion.mean
## 1 6.687606 6.571178
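
To see what inner_join() keeps and drops, here is a minimal sketch on made-up data. The treated and untreated_means tables and their values are hypothetical, not taken from Wages, and the by argument names the matching variables explicitly rather than relying on shared column names.

```r
library(dplyr)

# Hypothetical toy data: three treated rows...
treated <- tibble(
  id    = 1:3,
  sex   = c("male", "female", "male"),
  south = c("yes", "no", "no"),
  lwage = c(6.5, 6.8, 7.0)
)
# ...and untreated averages for only two of the possible cells
untreated_means <- tibble(
  sex   = c("male", "female"),
  south = c("yes", "no"),
  untreated.lwage = c(6.4, 6.6)
)

# Keep only treated rows whose (sex, south) cell also exists among the untreated
matched <- treated %>%
  inner_join(untreated_means, by = c("sex", "south"))

# id 3 (male, south == "no") has no untreated counterpart, so it is dropped
matched
```

This is why the matched union count can end up smaller than the original union count: treated observations without an exact untreated match fall out of the comparison.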
#Original union and nonunion counts, and matched union count
c(sum(Wages$union=='yes'),sum(Wages$union=='no'),nrow(union %>% inner_join(nonunion)))
## [1] 1516 2649 1274
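
The distance-based matching described earlier (find untreated observations whose W is within a of each treated W[i]) can be sketched in base R. This is a toy illustration under assumptions, not the estimator used elsewhere in these slides: the data frame df and its values are made up, matched untreated observations are averaged with equal weight, and treated observations with no untreated match within a are dropped.

```r
# Hypothetical toy data: D is treatment, W the matching variable, Y the outcome
df <- data.frame(
  D = c(1, 1, 1, 0, 0, 0, 0),
  W = c(2.0, 3.0, 5.0, 1.95, 2.05, 3.08, 4.0),
  Y = c(10, 12, 20, 8, 9, 11, 15)
)
a <- 0.1  # maximum allowed distance in W

treated   <- df[df$D == 1, ]
untreated <- df[df$D == 0, ]

# For each treated observation, average Y over untreated obs with W within a
treated$matched.Y <- sapply(treated$W, function(w) {
  close <- abs(untreated$W - w) <= a
  if (any(close)) mean(untreated$Y[close]) else NA_real_
})

# Assumption: treated observations with no match inside a are dropped
matched <- treated[!is.na(treated$matched.Y), ]

# Compare Y across treatment among the matched observations
c(treated.mean = mean(matched$Y), untreated.mean = mean(matched$matched.Y))
```

Shrinking a tightens the matches but drops more treated observations, which is the tradeoff the first step points to.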
- Use the atus package, from the American Time Use Survey. Load the atusresp and atusact data sets.
- Filter atusact to tiercode==110101 (eating and drinking). Then inner_join it with atusresp. Call the result eating and ungroup() it
- Use eating <- na.omit(eating) to nuke missing data
- Compare dur by hh_child, matching on everything else, using cut(,breaks=5) for everything that’s not a factor.

library(atus)
data(atusresp)
data(atusact)
eating <- atusact %>% filter(tiercode==110101) %>% inner_join(atusresp) %>% ungroup() %>%
  select(dur, hh_child, labor_status, student_status, work_hrs_week, partner_hh, weekly_earn, tuyear) %>%
  na.omit() %>%
  mutate(hrs.c = cut(work_hrs_week,breaks=5),earn.c = cut(weekly_earn,breaks=5),year.c = cut(tuyear,breaks=5))
kids <- filter(eating,hh_child=='yes')
nokids <- eating %>% filter(hh_child=='no') %>%
  group_by(hrs.c,earn.c,year.c,labor_status,student_status,partner_hh) %>%
  summarize(nokids.dur = mean(dur))
kids %>% inner_join(nokids) %>% summarize(kids.dur=mean(dur),nokids.dur=mean(nokids.dur))
## # A tibble: 1 x 2
## kids.dur nokids.dur
## <dbl> <dbl>
## 1 68.0 71.9