# Lecture 24 Instrumental Variables

## Nick Huntington-Klein

### March 20, 2019

## Recap

- We’ve covered quite a few methods for isolating causal effects!
- Controlling for variables to close back doors (explain X and Y with the control, remove what’s explained)
- Matching on variables to close back doors (find treated and non-treated observations with similar values of the matching variables)
- Using a control group to control for time (before/after difference for treated and untreated, then difference them)
- Using a cutoff to construct a very good control group (treated/untreated difference near a cutoff)

## Today

- We’ve got ONE LAST METHOD!
- Today we’ll be covering *instrumental variables*
- The basic idea is that we have some variable - the instrumental variable - that causes `X` but has no other back doors!

## Natural Experiments

- This calls back to our idea of trying to mimic an experiment without having an experiment. In fact, let’s think about an actual randomized experiment.
- We have some random assignment `R` that determines your `X`. So even though we have back doors between `X` and `Y`, we can identify `X -> Y`

## Natural Experiments

- The idea of instrumental variables is this:
- What if we can find a variable that can take the place of R in the diagram despite not actually being something we randomized in an experiment?
- If we can do that, we’ve clearly got a “natural experiment”
- When we find a variable that can do that, we call it an “instrument” or “instrumental variable”
- Let’s call it `Z`

## Instrumental Variable

So, for `Z` to take the place of `R` in the diagram, what do we need?

- `Z` must be related to `X` (typically `Z -> X` but not always)
- There must be *no open paths* from `Z` to `Y` *except for ones that go through* `X`

In other words, “`Z` is related to `X`, and all the effect of `Z` on `Y` goes THROUGH `X`”
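
One common way to write these two requirements down, as a sketch only: assume a linear model in which everything that affects `Y` other than `X` is collected in an error term. The symbols $\beta$ and $\varepsilon$ are notation introduced here for illustration, not something from the diagram.

$$
\begin{aligned}
Y &= \beta X + \varepsilon \\
\operatorname{Cov}(Z, X) &\neq 0 && \text{($Z$ is related to $X$)} \\
\operatorname{Cov}(Z, \varepsilon) &= 0 && \text{(no open path from $Z$ to $Y$ except through $X$)} \\
\operatorname{Cov}(Z, Y) &= \beta \operatorname{Cov}(Z, X) + \operatorname{Cov}(Z, \varepsilon) = \beta \operatorname{Cov}(Z, X)
\end{aligned}
$$

Under those assumptions, the part of `Y` that moves with `Z` is just $\beta$ times the part of `X` that moves with `Z`, which is why both conditions are needed.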

## Instrumental Variable

How?

- Explain `X` with `Z`, and keep only what *is* explained, `X'`
- Explain `Y` with `Z`, and keep only what *is* explained, `Y'`
- [If `Z` is logical/binary] Divide the difference in `Y'` between `Z` values by the difference in `X'` between `Z` values
- [If `Z` is not logical/binary] Get the relationship between `X'` and `Y'` (the slope of `Y'` on `X'`)
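
The steps above can be sketched on simulated data. This is a minimal illustration, not the lecture's own code: the variable `w` (an unobserved back door), the true effect of 2, and all function names are assumptions made up for the example. "Explaining with `Z`" is done with group means for a binary `Z` and with a straight-line fit for a continuous `Z`; in the non-binary case the sketch takes the slope of `Y'` on `X'`.

```python
import numpy as np


def simulate(n=100_000, true_effect=2.0, binary=True, seed=0):
    """Simulated data with made-up numbers: W is an unobserved back door."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)  # W causes both X and Y
    if binary:
        z = rng.integers(0, 2, size=n).astype(float)  # binary instrument
    else:
        z = rng.normal(size=n)  # continuous instrument
    x = z + w + rng.normal(size=n)  # Z -> X and W -> X
    y = true_effect * x + 3.0 * w + rng.normal(size=n)  # X -> Y and W -> Y
    return z, x, y


def slope(a, b):
    """Slope from a straight-line fit of b on a."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


def iv_binary(z, x, y):
    # Explain X and Y with Z: the explained part is the mean within each Z value.
    x1, x0 = x[z == 1].mean(), x[z == 0].mean()
    y1, y0 = y[z == 1].mean(), y[z == 0].mean()
    # Difference in Y' between Z values divided by the difference in X'.
    return (y1 - y0) / (x1 - x0)


def iv_continuous(z, x, y):
    # Explain X and Y with Z using straight-line fits; keep the fitted values.
    x_explained = np.polyval(np.polyfit(z, x, 1), z)
    y_explained = np.polyval(np.polyfit(z, y, 1), z)
    # Relationship between X' and Y', measured as the slope of Y' on X'.
    return slope(x_explained, y_explained)


z, x, y = simulate(binary=True)
print("naive slope of Y on X:", slope(x, y))
print("IV estimate, binary Z:", iv_binary(z, x, y))

z, x, y = simulate(binary=False)
print("IV estimate, continuous Z:", iv_continuous(z, x, y))
```

Because `w` opens a back door, the naive slope should land well above the simulated effect of 2, while both instrumental variable estimates should land close to it.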