# Lecture 19 Fixed Effects

## Recap

• Last time we talked about how controlling is a common way of blocking back doors to identify an effect
• We can control for a variable `W` by using `W` to explain our other variables, then taking the residuals
• Another form of controlling is using a sample that has only observations with similar values of `W`
• Some variables you want to be careful NOT to control for - you don’t want to close front doors, or open back doors by controlling for colliders
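As a reminder of what the residual approach looks like, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variables `W`, `X`, and `Y` are made up, and `lm()` is just one way of using `W` to explain the other variables.

```r
set.seed(19)
n <- 1000
W <- rnorm(n)            # simulated back-door variable that we CAN measure
X <- 2 * W + rnorm(n)    # W causes X
Y <- 3 * W + rnorm(n)    # W causes Y; X has no effect on Y in this simulation

# Raw relationship: driven entirely by the back door through W
raw_cor <- cor(X, Y)

# Use W to explain X and Y, then keep only the parts W does not explain
X_res <- resid(lm(X ~ W))
Y_res <- resid(lm(Y ~ W))
controlled_cor <- cor(X_res, Y_res)

print(c(raw = raw_cor, controlled = controlled_cor))
```

The raw correlation should be strongly positive, while the correlation between the residuals should sit near zero, since the only link between `X` and `Y` in this simulation runs through `W`.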

## Today

• Today we’ll be starting on our path for the rest of the class, where we’ll be talking about standard methods for performing causal inference
• Different ways of getting identification once we have a diagram!
• Our goal here will be to understand these methods conceptually
• We won’t necessarily be doing best-statistical-practices for these. You’ll learn those in later classes, and best-practices change over time anyway
• Our goal is to understand these methods and be able to apply a straightforward version of them, not to publish a research paper

## Today

• In particular we’ll be talking about a method that is commonly used to identify causal effects, called fixed effects
• We’ll be discussing the kind of causal diagram that fixed effects can identify
• All of the methods we’ll be discussing are like this - they’ll only apply to particular diagrams
• And so knowing our diagrams will be key to knowing when to use a given method

## The Problem

• One problem we ran into last time is that we can’t really control for things if we can’t measure them
• And there are lots of things we can’t measure or don’t have data for!
• So what can we do?

## The Solution

• If we observe each person/firm/country multiple times, then we can forget about controlling for the actual back-door variable we’re interested in
• And just control for person/firm/country identity instead!
• This will control for EVERYTHING unique to that individual that doesn’t change over time, whether we can measure it or not!
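To see how controlling for identity can stand in for a variable we can't measure, here is a minimal sketch on simulated data. The setup is hypothetical: `U` is an unmeasured trait that is constant within each individual, the true effect of `X` on `Y` is set to 1, and `ave()` from base R is used to get each individual's mean.

```r
set.seed(19)
n_id <- 200                          # number of individuals (assumed)
n_t  <- 10                           # observations per individual (assumed)
id <- rep(1:n_id, each = n_t)
U  <- rep(rnorm(n_id), each = n_t)   # unmeasured, fixed within each individual
X  <- 2 * U + rnorm(n_id * n_t)
Y  <- 1 * X + 3 * U + rnorm(n_id * n_t)   # true effect of X on Y is 1

# Ignoring the back door through U gives a biased slope
naive_slope <- unname(coef(lm(Y ~ X))["X"])

# Control for identity: subtract each individual's own mean from X and Y
X_within <- X - ave(X, id)
Y_within <- Y - ave(Y, id)
within_slope <- unname(coef(lm(Y_within ~ X_within))["X_within"])

print(c(naive = naive_slope, within = within_slope))
```

We never used `U` in the second regression, yet the within-individual slope should land close to the true effect of 1, while the naive slope should be well above it.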

## In Practice

• Let’s do this on the data from the “gapminder” package
• This data tracks life expectancy and GDP per capita in many countries over time
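```r
library(gapminder)
library(dplyr)  # provides %>%, group_by(), and mutate()
data(gapminder)
# Raw correlation between life expectancy and log GDP per capita
cor(gapminder$lifeExp, log(gapminder$gdpPercap))
```
```
## [1] 0.8076179
```
```r
# Control for country: subtract each country's own mean from both variables
gapminder <- gapminder %>% group_by(country) %>%
  mutate(lifeExp.r = lifeExp - mean(lifeExp),
         logGDP.r = log(gdpPercap) - mean(log(gdpPercap))) %>% ungroup()
cor(gapminder$lifeExp.r, gapminder$logGDP.r)
```
```
## [1] 0.6404051
```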

## So What?

• This isn’t any different, mechanically, from any other time we’ve controlled for something
• So what’s different here?
• Let’s think about what we’re doing conceptually
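The "mechanically the same" point can be checked directly. The sketch below uses a small made-up panel (5 hypothetical individuals, 4 observations each) and compares two ways of controlling for identity: subtracting each individual's mean, and taking residuals after using identity to explain `Y` with `lm()`.

```r
set.seed(19)
id <- rep(1:5, each = 4)     # hypothetical: 5 individuals, 4 observations each
Y  <- rnorm(20) + id         # each individual has a different average level

# Way 1: subtract each individual's own mean
Y_demeaned <- Y - ave(Y, id)

# Way 2: use identity to explain Y, then take the residuals
Y_resid <- unname(resid(lm(Y ~ factor(id))))

# Compare the two, allowing for floating-point rounding
same <- isTRUE(all.equal(Y_demeaned, Y_resid))
print(same)
```

If the two approaches agree, `same` is `TRUE`: controlling for identity is just our usual "explain, then take the residual" step, with identity playing the role of `W`.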

## What’s the Diagram?

• Why are we controlling for things in this gapminder analysis?
• Because there are LOTS of things that might be back doors between GDP per capita and life expectancy
• War, disease, political institutions, trade relationships, health of the population, economic institutions…