Lecture 8 Fixed Effects

Nick Huntington-Klein

2021-01-05

Recap

  • Last time we talked about how controlling is a common way of blocking back doors to identify an effect
  • We can control for a variable W by using W to explain our other variables, then taking the residuals
  • Another form of controlling is using a sample that has only observations with similar values of W
  • Some variables you want to be careful NOT to control for - you don’t want to close front doors, or open back doors by controlling for colliders
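As a reminder of what "explain with W, then take the residuals" looks like, here is a minimal sketch on simulated data. The variable names, sample size, and coefficients are all made up for illustration; W is a back door that drives both X and Y, and X has no true effect on Y.

```r
# Simulated example (all names and numbers are assumptions for illustration)
set.seed(1)
n <- 5000
W <- rnorm(n)              # the back-door variable
X <- 2 * W + rnorm(n)      # W causes X
Y <- 3 * W + rnorm(n)      # W causes Y; X has NO effect on Y

# Raw relationship is contaminated by W
raw_cor <- cor(X, Y)

# Control for W: explain X and Y with W, keep what is left over
X_res <- resid(lm(X ~ W))
Y_res <- resid(lm(Y ~ W))
controlled_cor <- cor(X_res, Y_res)

raw_cor
controlled_cor
```

The raw correlation should come out strongly positive, while the correlation of the residuals should sit close to zero, which matches the true (zero) effect in this simulation.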

Today

  • Today we’ll be starting on our path for the rest of the class, where we’ll be talking about standard methods for performing causal inference
  • Different ways of getting identification once we have a diagram!
  • Our goal is to understand these methods conceptually, pick up some good statistical practices for their use, and be able to apply a straightforward version of each

Today

  • In particular we’ll be talking about a method that is commonly used to identify causal effects, called fixed effects
  • We’ll be discussing the kind of causal diagram that fixed effects can identify
  • All of the methods we’ll be discussing are like this - they’ll only apply to particular diagrams
  • And so knowing our diagrams will be key to knowing when to use a given method

The Problem

  • One problem we ran into last time is that we can’t really control for things if we can’t measure them
  • And there are lots of things we can’t measure or don’t have data for!
  • So what can we do?

The Solution

  • If we observe each person/firm/country multiple times, then we can forget about controlling for the actual back-door variable we’re interested in
  • And just control for person/firm/country identity instead!
  • This will control for EVERYTHING unique to that individual, whether we can measure it or not!
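To see why controlling for identity handles things we can't measure, here is a sketch on simulated panel data. Everything here (the names id, u, x, y, the sample sizes, the true effect of 1) is an assumption for illustration. The variable u is a back door that is constant within each individual, and we pretend we never observe it.

```r
# Simulated panel: 200 individuals observed 10 times each (assumed numbers)
set.seed(42)
n_id <- 200
n_t  <- 10
id <- rep(seq_len(n_id), each = n_t)

u <- rnorm(n_id)[id]                     # unobserved, constant within individual
x <- u + rnorm(n_id * n_t)               # u causes x
y <- 1 * x + 2 * u + rnorm(n_id * n_t)   # true effect of x on y is 1

# Ignoring the back door: biased, because u is left in the error
naive <- unname(coef(lm(y ~ x))["x"])

# Controlling for identity: subtract each individual's own mean
x_w <- x - ave(x, id)   # ave() returns the group mean for every row
y_w <- y - ave(y, id)
within <- unname(coef(lm(y_w ~ x_w))["x_w"])

naive
within
```

We never used u in the second regression, but because u doesn't change within an individual, subtracting individual means removes it anyway. The within estimate should land near the true effect of 1, while the naive one should be well above it.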

In Practice

  • Let’s do this on the data from the “gapminder” package
  • This data tracks life expectancy and GDP per capita in many countries over time
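library(gapminder)
library(dplyr)  # provides %>%, group_by(), mutate(), ungroup()
data(gapminder)
# Raw correlation between life expectancy and log GDP per capita
cor(gapminder$lifeExp,log(gapminder$gdpPercap))
## [1] 0.8076179
# Subtract each country's own mean to keep only within-country variation
gapminder <- gapminder %>% group_by(country) %>%
  mutate(lifeExp.r = lifeExp - mean(lifeExp),
         logGDP.r = log(gdpPercap) - mean(log(gdpPercap))) %>% ungroup()
cor(gapminder$lifeExp.r,gapminder$logGDP.r)
## [1] 0.6404051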

So What?

  • This isn’t any different, mechanically, from any other time we’ve controlled for something
  • So what’s different here?
  • Let’s think about what we’re doing conceptually

What’s the Diagram?

  • Why are we controlling for things in this gapminder analysis?
  • Because there are LOTS of things that might be back doors between GDP per capita and life expectancy
  • War, disease, political institutions, trade relationships, health of the population, economic institutions…

What’s the Diagram?

  • There’s no way we can identify this
  • The list of back doors is very long
  • And likely includes some things we can’t measure!

What’s the Diagram?

  • HOWEVER! If we think that these things are likely to be constant within country…
  • Then we don’t really have a big long list of back doors, we just have one: “country”

What We Get

  • So what we get out of this is that we can identify our effect even if some of our back doors include variables that we can’t actually measure
  • When we do this, we’re basically comparing countries to themselves at different time periods!
  • Pretty good way to do an apples-to-apples comparison!

Graphically

  • The post-fixed-effects dots are basically a bunch of “Raw Country X” pasted together.
  • Imagine taking “Raw Pakistan” and moving it to the center, then taking “Raw Britain” and moving it to the center, etc.
  • Ignoring the baseline differences between Pakistan, Britain, China, etc., in their GDP per capita and life expectancy, and just looking within each country.
  • We are ignoring all differences between countries (since that way back doors lie!) and looking only at differences within countries.
  • Fixed Effects is sometimes also referred to as the “within” estimator
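The "move each country to the center" idea can be sketched with a tiny made-up data set. The three countries A, B, C and their numbers are hypothetical; they are built so that the countries differ a lot in their baselines, but within each country life expectancy rises by 0.5 for each unit of GDP.

```r
# Hypothetical data: three countries with very different baselines
df <- data.frame(
  country = rep(c("A", "B", "C"), each = 4),
  gdp     = c(1, 2, 3, 4,   11, 12, 13, 14,   21, 22, 23, 24),
  life    = c(50, 50.5, 51, 51.5,   60, 60.5, 61, 61.5,   70, 70.5, 71, 71.5)
)

# Slide each country's cloud of points to the center (subtract country means)
df$gdp_w  <- df$gdp  - ave(df$gdp,  df$country)
df$life_w <- df$life - ave(df$life, df$country)

# After centering, every country has mean zero: baseline differences are gone
group_means <- tapply(df$gdp_w, df$country, mean)

# Pooled slope mixes between- and within-country variation
pooled_slope <- unname(coef(lm(life ~ gdp, data = df))["gdp"])
# Within slope uses only the centered data
within_slope <- unname(coef(lm(life_w ~ gdp_w, data = df))["gdp_w"])

group_means
pooled_slope
within_slope
```

The pooled slope is pulled upward by the big gaps between countries, while the within slope recovers the 0.5 that was built into each country's own data.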

In Action

Notably

  • This does assume, of course, that all those back door variables CAN be described by country
  • In other words, that these back doors operate through things that are fixed within country
  • If something is a back door and changes over time in that country, fixed effects won’t help!

Varying Over Time

  • For example, earlier we mentioned war… that’s not fixed within country! A given country is at war sometimes and not other times.

Varying Over Time

  • Of course, in this case, we could control for War as well and be good!
  • Having time-varying back doors doesn’t mean that fixed effects doesn’t work, it just means you need to control for that stuff too
  • It always comes down to thinking carefully about your diagram
  • Fixed effects mainly works as a convenient way of combining together lots of different constant-within-country back doors into something that lets us identify the model even if we can’t measure them all
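Here is a sketch of the time-varying problem on simulated data. All names and numbers are assumptions: u is a fixed-within-unit back door, war is a back door that switches on and off over time, and the true effect of x on y is 1. Fixed effects are done here with dummy variables in lm(), just to keep the example in base R.

```r
# Simulated panel with one fixed and one time-varying back door (assumed numbers)
set.seed(7)
n_id <- 200
n_t  <- 10
id  <- rep(seq_len(n_id), each = n_t)

u   <- rnorm(n_id)[id]                  # fixed within unit: FE handles this
war <- rbinom(n_id * n_t, 1, 0.3)       # changes over time: FE does NOT handle this
x   <- u - 2 * war + rnorm(n_id * n_t)
y   <- 1 * x + 2 * u - 3 * war + rnorm(n_id * n_t)   # true effect of x is 1

# Fixed effects alone: the war back door is still open
fe_only <- unname(coef(lm(y ~ x + factor(id)))["x"])

# Fixed effects plus a control for the time-varying back door
fe_plus <- unname(coef(lm(y ~ x + war + factor(id)))["x"])

fe_only
fe_plus
```

The first estimate should be noticeably too large, and the second should be close to 1. Fixed effects took care of u for free, but war had to be measured and controlled for directly.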

Fixed Effects in Regression

  • We can just do fixed effects as we did: subtract out the group means and analyze (perhaps with regression) what’s left
  • We can also include dummy variables for each group/individual, which accomplishes the same thing

\[ Y = \beta_0 + \beta_1 Group_1 + \beta_2 Group_2 + \dots + \beta_N Group_N + \beta_{N+1} X + \varepsilon \]

\[ Y = \beta_i + \beta_1X + \varepsilon \]
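A quick check that the two approaches really do give the same slope on X, using simulated data (the group structure, names, and coefficients are assumptions for illustration):

```r
# Simulated grouped data: 20 groups of 5 observations (assumed numbers)
set.seed(123)
g     <- rep(1:20, each = 5)
alpha <- rnorm(20, sd = 3)[g]          # group-level intercepts
x     <- alpha + rnorm(100)
y     <- 0.5 * x + alpha + rnorm(100)

# Approach 1: a dummy variable for each group (lm drops a reference group itself)
b_dummy <- unname(coef(lm(y ~ x + factor(g)))["x"])

# Approach 2: subtract group means, then regress what's left
x_w <- x - ave(x, g)
y_w <- y - ave(y, g)
b_within <- unname(coef(lm(y_w ~ x_w))["x_w"])

c(dummy = b_dummy, within = b_within)
```

The two coefficients match up to floating-point error. One caution: the standard errors that lm() reports for the de-meaned version are not the same as the dummy-variable ones, since lm() doesn't know the group means were estimated and so uses the wrong degrees of freedom.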

Fixed Effects in Regression

  • Why does that work?
  • We want to “control for group/individual” right? So… just… put in a control for group/individual
  • Of course, like all categorical variables as predictors, we leave out a reference group
  • But here, unlike with, say, a binary predictor, we’re rarely interested in the FE coefficients themselves. Most software works with the mean-subtraction approach (or a variant) and doesn’t even report them!

Fixed Effects in Regression: Variation

  • Remember we are isolating within variation
  • If an individual has no within variation, say their treatment never changes, they basically get washed out entirely!
  • A fixed-effects regression wouldn’t represent them. For the same reason, you can’t use FE to study the effect of things that are fixed over time
  • And in general if there’s not a lot of within variation, FE is going to be very noisy. Make sure there’s variation to study!
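A tiny made-up example shows the washing out directly. The three individuals, the treat pattern, and the female variable are all hypothetical. Individual 3 is always treated, and female never changes for anyone.

```r
# Hypothetical panel: 3 individuals, 3 periods each
id     <- rep(1:3, each = 3)
treat  <- c(0, 0, 1,   0, 1, 1,   1, 1, 1)   # individual 3 is always treated
female <- c(1, 1, 1,   0, 0, 0,   1, 1, 1)   # fixed over time for everyone

# Within variation = value minus the individual's own mean
treat_w  <- treat  - ave(treat,  id)
female_w <- female - ave(female, id)

# Standard deviation of treatment within each individual
within_sd <- tapply(treat, id, sd)

treat_w[id == 3]   # all zeros: individual 3 contributes nothing
female_w           # all zeros: nothing left to study
within_sd
```

Checking the within-group standard deviation of your treatment, as in the last line, is one simple way to see how much variation FE actually has to work with before you run the regression.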

Fixed Effects in Regression: Notes

  • It’s common to cluster standard errors at the level of the fixed effects, since it seems likely that errors would be correlated over time (autocorrelated errors)
  • It’s possible to have more than one set of fixed effects. \(Y = \beta_i + \beta_j + \beta_1X + \varepsilon\)
  • But interpretation gets tricky - think through what variation in \(X\) you’re looking at at that point!
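Here is a sketch of two sets of fixed effects (individual and time period) on simulated data. All names and numbers are assumptions: a_i is a back door fixed within individual, g_t is a shock shared by everyone in a given period, and the true effect of x is 1.5. Dummy variables in lm() stand in for the fixed effects; in fixest the same model would be written with both variables after the bar, as in feols(y ~ x | id + t, data = panel).

```r
# Simulated panel: 200 individuals, 8 periods (assumed numbers)
set.seed(99)
n_id <- 200
n_t  <- 8
panel <- expand.grid(t = seq_len(n_t), id = seq_len(n_id))

a_i <- rnorm(n_id)[panel$id]                      # fixed within individual
g_t <- c(-2, -1, 0, 1, 2, 1, 0, -1)[panel$t]      # common shock in each period

panel$x <- a_i + g_t + rnorm(nrow(panel))
panel$y <- 1.5 * panel$x + 2 * a_i + 2 * g_t + rnorm(nrow(panel))  # true effect 1.5

# Individual FE only: the period shock is still an open back door
one_way <- unname(coef(lm(y ~ x + factor(id), data = panel))["x"])

# Individual AND period FE
two_way <- unname(coef(lm(y ~ x + factor(id) + factor(t), data = panel))["x"])

one_way
two_way
```

With both sets of fixed effects, the variation in x that is left is how much an individual differs from their own average, beyond what everyone experienced in that period. That is the variation the two_way estimate is using.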

Coding up Fixed Effects

  • We will use the fixest package
  • It’s very fast, and can be easily adjusted to do FE with other regression methods like logit, or combined with instrumental variables
  • Clusters at the first listed fixed effect by default
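library(fixest)
library(modelsummary)  # provides msummary()

# outcome, predictors, FEs, and data are placeholders for your own variable names and data set
m1 <- feols(outcome ~ predictors | FEs, data = data)
msummary(m1)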
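To make the template concrete, here is a sketch that fills it in with simulated data. The data set sim and the variables id, u, x, y are made up for illustration, and the true effect of x is set to 2. The code asks for clustering explicitly through the cluster argument rather than relying on a default, and it only calls feols() if fixest is installed, falling back to a dummy-variable lm() as the reference estimate.

```r
# Simulated panel: 300 individuals, 6 observations each (assumed numbers)
set.seed(2021)
sim <- data.frame(id = rep(1:300, each = 6))
sim$u <- rnorm(300)[sim$id]                        # unobserved, fixed within id
sim$x <- sim$u + rnorm(nrow(sim))
sim$y <- 2 * sim$x + 3 * sim$u + rnorm(nrow(sim))  # true effect of x is 2

# Reference estimate: dummy variables for each id in base R
b_lm <- unname(coef(lm(y ~ x + factor(id), data = sim))["x"])

# Same model with fixest, if the package is available
if (requireNamespace("fixest", quietly = TRUE)) {
  m_fe <- fixest::feols(y ~ x | id, data = sim, cluster = ~id)
  b_fe <- unname(coef(m_fe)["x"])
} else {
  b_fe <- NA_real_   # fixest not installed, nothing to compare
}

c(lm_dummies = b_lm, feols = b_fe)
```

Both coefficients should agree and sit near 2. The fixed effects go after the bar in the formula, and the dummy coefficients never show up in the feols() output.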

Example: Sentencing

  • What effect do sentencing reforms have on crime?
  • One purpose of punishment for crime is to deter crime
  • If sentences are clearer and less risky, that may weaken the deterrent to crime and so increase crime
  • Marvell & Moody study this using data on reforms in US states from 1969-1989

Example: Sentencing

  • I’ve omitted code reading in the data
  • But in our data we have multiple observations per state
head(mmdata)
## # A tibble: 6 x 6
##   state   year assault robbery pop1000 sentreform
##   <chr>  <dbl>   <dbl>   <dbl>   <dbl>      <dbl>
## 1 "ALA "    70    7413    1731    3450          0
## 2 "ALA "    71    7645    2005    3497          0
## 3 "ALA "    72    7431    2407    3540          0
## 4 "ALA "    73    8362    2809    3581          0
## 5 "ALA "    74    8429    3562    3628          0
## 6 "ALA "    75    8440    4446    3681          0
mmdata <- mmdata %>% mutate(assaultper1000 = assault/pop1000,
         robberyper1000 = robbery/pop1000)

Fixed Effects

  • We can see how robbery rates evolve in each state over time as states implement reform

Fixed Effects

  • You can tell that states are more or less likely to implement reform in a way that’s correlated with the level of robbery they already had
  • So SOMETHING about the state is driving both the level of robberies AND the decision to implement reform
  • Who knows what!
  • Our diagram has reform -> robberies and reform <- state -> robberies, which is something we can address with fixed effects.

Fixed Effects

sentencing_ols <- lm(robberyper1000 ~ sentreform, data = mmdata)
sentencing_fe <- feols(robberyper1000 ~ sentreform | state, data = mmdata)
msummary(list('OLS' = sentencing_ols, 'FE' = sentencing_fe), stars = TRUE, gof_omit = 'AIC|BIC|F|Lik|Adj|Pseudo')
                 OLS         FE
(Intercept)      1.254***
                 (0.036)
sentreform       0.352***    0.245***
                 (0.082)     (0.076)
Num.Obs.         1000        1000
R2               0.018       0.919
R2 Within                    0.062
Std. errors                  Clustered (state)
* p < 0.1, ** p < 0.05, *** p < 0.01

Example

  • The OLS estimates (1.254 and 0.352) include the fact that different kinds of states tend to institute reform
  • The 0.245 doesn’t!
  • Looks like the deterrent effect was real! It’s still important to consider whether there might be time-varying back doors too, since we don’t account for those in our analysis
  • What things might change within state over time that would be related to robberies and to sentencing reform?

Practice

  • We want to know the effect of your teacher on the test scores of high school students
  • Some potential back doors might go through: parents' intelligence, age, demographics, school, last year's teacher
  • Draw a diagram including all these variables, plus maybe some unobservables where appropriate
  • If you used fixed effects for students, what back doors would still be open?
  • What would the feols() command for this regression look like?

Practice Answers

  • Fixed effects would close your back doors for parents' intelligence, demographics, and school, but leave open age and last year's teacher
m <- feols(TestScore ~ Teacher + Age + LastYearsTeacher | 
             Student, data = data)