Lecture 16 Back Doors

Nick Huntington-Klein

March 3, 2019


  • We’ve now covered how to create causal diagrams
  • (aka Directed Acyclic Graphs, if you’re curious what “dag”itty means)
  • We simply write out the list of the important variables, and draw causal arrows indicating what causes what
  • This allows us to figure out what we need to do to identify our effect of interest
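As a rough illustration of "a list of variables plus arrows," here is a minimal Python sketch. It assumes the IP spend example used later in these slides, where tech affects both IP spend and profits. The names `variables`, `arrows`, and `is_acyclic` are made up for this illustration and are not part of dagitty. The check at the end confirms the "acyclic" part of Directed Acyclic Graph: following the arrows never leads back to where you started.

```python
# Assumed example diagram: tech affects both IP.spend and profits,
# and IP.spend affects profits. Each pair is (cause, effect).
variables = ["IP.spend", "tech", "profits"]
arrows = [("tech", "IP.spend"), ("tech", "profits"), ("IP.spend", "profits")]


def is_acyclic(variables, arrows):
    """Return True if no variable can be reached from itself by following arrows."""
    children = {v: [] for v in variables}
    for cause, effect in arrows:
        children[cause].append(effect)

    state = {v: "unvisited" for v in variables}

    def has_cycle(node):
        # Depth-first search; meeting an "in progress" node means we looped back.
        state[node] = "in progress"
        for child in children[node]:
            if state[child] == "in progress":
                return True
            if state[child] == "unvisited" and has_cycle(child):
                return True
        state[node] = "done"
        return False

    return not any(state[v] == "unvisited" and has_cycle(v) for v in variables)


print(is_acyclic(variables, arrows))
```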


  • But HOW? How does it know?
  • Today we’ll be covering the process that lets you figure out whether you can identify your effect of interest, and how
  • Once we have our diagram, it turns out to be pretty straightforward
  • So easy a computer can do it!

The Back Door and the Front Door

  • The basic way we’re going to be thinking about this is with a metaphor
  • When you do data analysis, it’s like observing that someone left their house for the day
  • When you do causal inference, it’s like asking how they left their house
  • You want to make sure that they came out the front door, and not out the back door, not out the window, not out the chimney

  • Let’s go back to this example

  • We’re interested in the effect of IP spend on profits. That means our front door is all the ways in which IP spend causally affects profits
  • Our back door is any other thing that might drive a correlation between the two - the way that tech affects both


  • In order to formalize this a little more, we need to think about the various paths
  • We observe that you got out of your house, but we want to know the paths you might have walked to get there
  • So, what are the paths we can walk to get from IP.spend to profits?


  • We can go IP.spend -> profits
  • Or IP.spend <- tech -> profits
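Walking the paths by hand can also be sketched in a few lines of Python. This is only an illustration, not how dagitty does it internally. The edge list is an assumption read off the two paths above, and `all_paths` and `classify` are hypothetical helper names. The sketch walks arrows in either direction, since a path may travel against an arrow. It labels a path "back door" when its first arrow points into IP.spend, which is the usual convention.

```python
# Assumed diagram, read off the two paths listed above. Each pair is (cause, effect).
EDGES = [
    ("IP.spend", "profits"),
    ("tech", "IP.spend"),
    ("tech", "profits"),
]


def all_paths(edges, start, end):
    """List every path from start to end, walking arrows in either direction."""
    # For each variable, record its neighbors and how the arrow is drawn
    # when stepping from the variable to that neighbor.
    neighbors = {}
    for cause, effect in edges:
        neighbors.setdefault(cause, []).append((effect, "->"))
        neighbors.setdefault(effect, []).append((cause, "<-"))

    paths = []

    def walk(node, visited, text):
        if node == end:
            paths.append(text)
            return
        for nxt, arrow in neighbors.get(node, []):
            if nxt not in visited:  # never pass through the same variable twice
                walk(nxt, visited | {nxt}, f"{text} {arrow} {nxt}")

    walk(start, {start}, start)
    return paths


def classify(path, start):
    """Label a path by its arrows, relative to the treatment variable `start`."""
    if path.startswith(f"{start} <-"):
        return "back door"   # first arrow points INTO the treatment
    if "<-" not in path:
        return "front door"  # every arrow points forward, so the path is causal
    return "other"


paths = all_paths(EDGES, "IP.spend", "profits")
for p in paths:
    print(f"{p}   ({classify(p, 'IP.spend')})")
```

With this three-variable diagram the loop should list the same two paths as above, one of each kind.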