Advanced Methods for Establishing Causal Inference

Chapter 8

© 2019 McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or distribution without the prior written consent of McGraw-Hill Education

Learning Objectives

Explain how instrumental variables can improve causal inference in regression analysis

Execute two-stage least squares regression

Judge which type of variables may be used as instrumental variables

Identify a difference-in-differences regression

Execute regression incorporating fixed effects

Distinguish the dummy variable approach from a within estimator for a fixed effect regression model

Instrumental variables

In the context of regression analysis, a variable that allows us to isolate the causal effect of a treatment on an outcome due to its correlation with the treatment and its lack of correlation with the unobserved factors affecting the outcome

Can improve causal inference in regression analysis

Instrumental Variables

A firm is attempting to determine how its sales depend on the price it charges for its product

Beginning with a simple data-generating process:

Sales_i = α + β_1 Price_i + U_i

If local demand depends on local income, then local income is a confounding factor:

Sales_i = α + β_1 Price_i + β_2 Income_i + U_i

Instrumental Variables: An Example

Including income in the model removes local income as confounding factor

Does its inclusion ensure that no other confounding factors still exist?

Many possibilities may come to mind, including local competition, market size, and market growth rate
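The role of income as a confounding factor can be illustrated with a small simulation. This is only a sketch on made-up data: the coefficients, the sample size, and the helper name `ols` are assumptions chosen for illustration, not values from the example above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process (all numbers are assumptions):
# higher-income markets face higher prices AND buy more.
income = rng.normal(50.0, 10.0, n)
price = 5.0 + 0.1 * income + rng.normal(0.0, 1.0, n)
sales = 100.0 - 4.0 * price + 1.5 * income + rng.normal(0.0, 5.0, n)

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slopes...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

beta_short = ols(sales, price)           # Income omitted
beta_long = ols(sales, price, income)    # Income included as a control

print("Price coefficient without Income:", round(beta_short[1], 2))
print("Price coefficient with Income:   ", round(beta_long[1], 2))
```

In this simulated setting the true effect of Price is -4. Omitting Income pushes the estimate far from that value, while controlling for Income recovers it. Any confounder left out of the regression (local competition, for instance) would cause the same kind of problem.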

Instrumental Variables: An Example

We may be unable to collect data on all confounding factors or find suitable proxies

Then we are unable to remove the endogeneity problem by including controls and/or proxy variables

A widely used method for measuring causality that can circumvent this problem involves instrumental variables

Instrumental Variables

Suppose we know price differences across some of the stores were solely due to differences in fuel costs

When two locations have different prices, we generally cannot attribute differences in sales to price differences, since these two locations likely differ in local competition

Rather than use all of the variation in price across the stores to measure the effect of price on sales, we focus on the subset of price movements due to variation in fuel costs

Instrumental Variables

When two locations have different prices only because their fuel costs differ, any difference in sales can be attributed to price, since fuel costs don’t impact sales per se

Instrumental Variables: An Example

Suppose we have the following data-generating function:

Y_i = α + β_1 X_{1i} + β_2 X_{2i} + … + β_K X_{Ki} + U_i

Variable Z is a valid instrument for X_1 if Z is both exogenous and relevant:

Exogenous: Z has no effect on the outcome variable beyond the combined effects of all variables in the determining function (X_1, …, X_K)

Relevant: For the assumed data-generating process, Z is correlated with X_1 after controlling for X_2, …, X_K

Instrumental Variables

Two-stage least squares regression (2SLS) is the process of using two regressions to measure the causal effect of a variable while utilizing an instrumental variable

The first stage of 2SLS determines the subset of variation in Price that can be attributed to changes in fuel costs; we can call the variable that tracks this variation \hat{Price}

The second stage determines how Sales change with the movements of \hat{Price}

This means that if we see Sales correlate with \hat{Price}, there is reason to interpret this co-movement as the causal effect of Price

Two-Stage Least Squares Regression

For an assumed data-generating process:

Y_i = α + β_1 X_{1i} + β_2 X_{2i} + … + β_K X_{Ki} + U_i

Suppose X_1 is endogenous and Z is a valid instrument for X_1. To execute 2SLS, in the first stage we assume:

X_{1i} = γ + δ_1 Z_i + δ_2 X_{2i} + … + δ_K X_{Ki} + V_i

Then regress X_1 on Z, X_2, …, X_K and calculate predicted values for X_1, defined as:

\hat{X}_{1i} = \hat{γ} + \hat{δ}_1 Z_i + \hat{δ}_2 X_{2i} + … + \hat{δ}_K X_{Ki}

Two-Stage Least Squares Regression

In the second stage, regress Y on \hat{X}_1, X_2, …, X_K

From the second stage regression, the estimated coefficient for \hat{X}_1 is a consistent estimate for β_1 (the causal effect of X_1 on Y) and the estimated coefficient on X_2 is a consistent estimate for β_2

Run two consecutive regressions using the predictions from the first as an independent variable in the second

Statistical software combines this process into a single command
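The two stages can be sketched by hand on simulated data. Everything below is an assumption made for illustration: the coefficients, the unobserved `competition` confounder, and the helper name `ols` are hypothetical, and fuel cost is constructed so that it affects Sales only through Price.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical simulated stores (all coefficients are assumptions).
income = rng.normal(50.0, 10.0, n)
competition = rng.normal(0.0, 1.0, n)    # unobserved confounder
fuel_cost = rng.normal(3.0, 0.5, n)      # instrument: shifts Price only
price = (5.0 + 2.0 * fuel_cost + 0.1 * income
         - 1.0 * competition + rng.normal(0.0, 0.5, n))
sales = (100.0 - 4.0 * price + 1.5 * income
         - 6.0 * competition + rng.normal(0.0, 5.0, n))

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slopes...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Naive OLS: Price is endogenous because competition is omitted.
b_ols = ols(sales, price, income)

# Stage 1: regress Price on the instrument and the exogenous control,
# then form the predicted values (the "Price-hat" variable).
g = ols(price, fuel_cost, income)
price_hat = g[0] + g[1] * fuel_cost + g[2] * income

# Stage 2: regress Sales on Price-hat and the exogenous control.
b_2sls = ols(sales, price_hat, income)

print("OLS  Price coefficient:", round(b_ols[1], 2))
print("2SLS Price coefficient:", round(b_2sls[1], 2))
```

The true Price effect in this simulation is -4. The naive OLS estimate is pulled away from it by the omitted confounder, while the second-stage coefficient on the predicted price lands close to it. Note that standard errors from a hand-run second stage are not valid, which is one reason to prefer a dedicated 2SLS routine in practice.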

Two-Stage Least Squares Regression

2SLS Estimates for Y Regressed on X1, X2, and X3

Summary of 2SLS where we have J endogenous variables and L ≥ J instrumental variables

Y_i = α + β_1 X_{1i} + β_2 X_{2i} + … + β_K X_{Ki} + U_i

Suppose X_1, …, X_J are endogenous and Z_1, …, Z_L are valid instruments for X_1, …, X_J

Execution of 2SLS proceeds as follows:

Two-Stage Least Squares Regression

Two-Stage Least Squares Regression

1. Regress X_1, …, X_J on Z_1, …, Z_L, X_{J+1}, …, X_K in J separate regressions

2. Obtain predicted values \hat{X}_1, …, \hat{X}_J using the corresponding estimated regression equations in Step 1. This concludes “Stage 1”

3. Regress Y on \hat{X}_1, …, \hat{X}_J, X_{J+1}, …, X_K, which yields consistent estimates for α, β_1, …, β_K. This is “Stage 2”
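The three steps above can be written as one small function. This is a sketch under stated assumptions: the function name `two_sls`, its argument layout, and the simulated data (J = 2 endogenous variables, L = 3 instruments, one exogenous control) are hypothetical and chosen only to exercise the procedure.

```python
import numpy as np

def two_sls(y, X_endog, X_exog, Z):
    """Manual 2SLS (hypothetical helper).

    y: (n,), X_endog: (n, J), X_exog: (n, K-J), Z: (n, L) with L >= J.
    Returns coefficients ordered [constant, endogenous..., exogenous...].
    """
    n = len(y)
    if Z.shape[1] < X_endog.shape[1]:
        raise ValueError("need at least as many instruments as endogenous variables")
    const = np.ones((n, 1))
    # Step 1: regress each endogenous variable on all instruments and
    # exogenous variables (lstsq handles the J regressions at once).
    W = np.hstack([const, Z, X_exog])
    first_stage, *_ = np.linalg.lstsq(W, X_endog, rcond=None)
    # Step 2: predicted values for the endogenous variables.
    X_hat = W @ first_stage
    # Step 3: regress y on the predicted values and exogenous variables.
    S = np.hstack([const, X_hat, X_exog])
    beta, *_ = np.linalg.lstsq(S, y, rcond=None)
    return beta

# Simulated data; every number here is an assumption.
rng = np.random.default_rng(2)
n = 20000
u = rng.normal(size=n)                    # unobserved confounder
Z = rng.normal(size=(n, 3))               # three instruments
x3 = rng.normal(size=n)                   # exogenous control
x1 = 1.0 * Z[:, 0] + 0.5 * Z[:, 2] + 0.5 * x3 + u + rng.normal(size=n)
x2 = 1.0 * Z[:, 1] - 0.5 * Z[:, 2] - 0.5 * u + rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 1.0 * x2 + 0.5 * x3 + 2.0 * u + rng.normal(size=n)

beta = two_sls(y, np.column_stack([x1, x2]), x3[:, None], Z)
print(np.round(beta, 2))   # true values used above: 2.0, 1.5, -1.0, 0.5
```

As with any hand-rolled second stage, only the point estimates should be trusted here; the usual OLS standard errors from Step 3 are not correct for 2SLS.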

An instrumental variable must be exogenous and relevant, and if so, we can use 2SLS to get consistent estimates for the parameters of the determining function

Can we assess whether the instrumental variable possesses these two characteristics?

Evaluating Instruments

An instrumental variable is exogenous if it is uncorrelated with unobservables affecting the dependent variable

For a data-generating process Y_i = α + β_1 X_{1i} + … + β_K X_{Ki} + U_i, an instrumental variable Z must have Corr(Z, U) = 0

To try to check this, we might regress Y on X_1, …, X_K and calculate the residuals as: e_i = Y_i − \hat{α} − \hat{β}_1 X_{1i} − … − \hat{β}_K X_{Ki}

We could then calculate the sample correlation between Z and the residuals, believing this to be an estimate for the correlation between Z and U

Exogeneity

The problem is that the residuals were calculated using a regression with an endogenous variable

Our parameter estimates are not consistent, meaning the sample correlation between Z and the residuals generally is not an estimator for the correlation between Z and U

If the number of instrumental variables is equal to the number of endogenous variables, there is no way to test for exogeneity

If the number of instrumental variables is greater than the number of endogenous variables, there are tests that can be performed to find evidence that at least some instrumental variables are not exogenous, but there is no way to test that all are exogenous

Exogeneity

Testing for relevance is simple and can be added when conducting 2SLS

For a data-generating process Y_i = α + β_1 X_{1i} + … + β_K X_{Ki} + U_i where X_1 is endogenous, Z is relevant if it is correlated with X_1 after controlling for X_2, …, X_K

We can assess whether this is true by regressing X_1 on Z, X_2, …, X_K and checking whether the coefficient on Z is statistically different from zero
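A relevance check of this kind can be sketched as a first-stage regression with a t-statistic on the instrument. The data below are simulated, and the coefficients and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Hypothetical first-stage data (all numbers are assumptions).
income = rng.normal(50.0, 10.0, n)
fuel_cost = rng.normal(3.0, 0.5, n)
price = 5.0 + 2.0 * fuel_cost + 0.1 * income + rng.normal(0.0, 1.0, n)

# First stage: regress Price on the instrument and the other regressor.
X = np.column_stack([np.ones(n), fuel_cost, income])
coef, *_ = np.linalg.lstsq(X, price, rcond=None)

# Classical (homoskedastic) OLS standard errors.
resid = price - X @ coef
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
cov = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))

t_fuel = coef[1] / se[1]
print("Coefficient on fuel cost:", round(coef[1], 3))
print("t-statistic:", round(t_fuel, 1))
```

A large t-statistic on the instrument is evidence of relevance. With a single instrument the first-stage F-statistic for the instrument equals the square of this t-statistic, and a common rule of thumb in applied work treats an F-statistic below roughly 10 as a sign of a weak instrument.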

Relevance

Regression Output for Price Regressed on Income and Fuel Costs

It is important to establish convincing evidence that an instrumental variable(s) is relevant

Doing so avoids a common criticism of instrumental variables centered on the usage of weak instruments

A weak instrument is an instrumental variable that has little partial correlation with the endogenous variable whose causal effect on an outcome it is meant to measure

Relevance

Regression Results for X_1 Regressed on X_2, X_3, Z_1, and Z_2

Regression Results for Y Regressed on \hat{X}_1, X_2, and X_3

Classical Applications of Instrumental Variables for Business

Cost variables are popular choices as instrumental variables, particularly in demand estimations

Any variable that affects the costs of producing the good or service (input prices, cost per unit, etc.) can be a valid instrument for Price

Prices charged typically depend on costs

Cost variables are often both relevant and exogenous when used to instrument for Price in a demand equation

Classical Applications of Instrumental Variables for Business

Policy change is another popular choice as an instrumental variable

Local sales tax and/or price regulations can serve as instrumental variables for Price in a demand equation

Labor laws can serve as instrumental variables for wages when seeking to measure the effect of wages on productivity

Policy changes often affect business decisions (making them relevant) but often occur for reasons not related to business outcomes (exogenous)

With panel data we are able to observe the same cross-sectional unit multiple times at different points in time

Difference-in-differences regression

Fixed-effects model

Dummy variable estimation

Within estimation

Panel Data Method

Consider an individual who owns a large number of liquor stores in the states of Indiana and Michigan

Suppose the Indiana state government decides to increase the sales tax on liquor sales by 3%

The owner may want to know the effect of this tax increase on her profit

Difference-in-Differences

To learn the effect of the tax increase on profit, the store owner collects data for two years as shown below:

Difference-in-Differences

To assess the effect of a tax hike on profit, the store owner may assume the following data-generating process:

Profits_{it} = α + β TaxHike_{it} + U_{it}

Profits_{it} is the profit of store i during Year t, and TaxHike_{it} equals 1 if the 3% tax hike was in place for store i during Year t and 0 otherwise

We could regress Profits on TaxHike, but it is difficult to argue that TaxHike is not endogenous

TaxHike equals 1 for a specific group of stores at a specific time; this method of administering the treatment may be correlated with unobserved factors affecting Profits

Difference-in-Differences

Control for a cross-sectional group (g = Indiana, Michigan) and for time (t = 2016, 2017)

Assume the following model:

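Profits_{igt} = α + β_1 Indiana_g + β_2 Year_t + β_3 TaxHike_{gt} + U_{igt}

Here Indiana_g equals 1 for stores in Indiana and Year_t equals 1 for 2017 (both 0 otherwise). Because the tax hike applies only to Indiana in 2017, the data-generating process can also be written as:

Profits_{igt} = α + β_1 Indiana_g + β_2 Year_t + β_3 Indiana_g × Year_t + U_{igt}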

Difference-in-Differences

β3 is the diff-in-diff for profits in this example

Difference in expected profits between 2017 and 2016 for Indiana:

(α + β_1 + β_2 + β_3) − (α + β_1) = β_2 + β_3

Difference in expected profits between 2017 and 2016 for Michigan:

(α + β_2) − α = β_2

Take the difference between the change in profits in Indiana and the change in Michigan to get the diff-in-diff:

(β_2 + β_3) − β_2 = β_3
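The derivation above can be checked numerically: the diff-in-diff computed from the four group-by-year means coincides with the coefficient on the interaction term. The data here are simulated, and the parameter values and sample sizes are assumptions, not the store owner's figures.

```python
import numpy as np

rng = np.random.default_rng(4)
n_per_cell = 500   # hypothetical number of store observations per state-year

# Assumed parameter values for the simulation.
alpha, b1, b2, b3 = 100.0, 20.0, 5.0, -8.0

indiana = np.repeat([0.0, 0.0, 1.0, 1.0], n_per_cell)    # 1 = Indiana
year2017 = np.repeat([0.0, 1.0, 0.0, 1.0], n_per_cell)   # 1 = 2017
profits = (alpha + b1 * indiana + b2 * year2017
           + b3 * indiana * year2017
           + rng.normal(0.0, 3.0, 4 * n_per_cell))

def cell_mean(g, t):
    """Average profit for state indicator g and year indicator t."""
    return profits[(indiana == g) & (year2017 == t)].mean()

# Diff-in-diff from the four cell means.
change_indiana = cell_mean(1, 1) - cell_mean(1, 0)
change_michigan = cell_mean(0, 1) - cell_mean(0, 0)
did_means = change_indiana - change_michigan

# Diff-in-diff as the coefficient on Indiana x Year in a regression.
X = np.column_stack([np.ones_like(profits), indiana, year2017,
                     indiana * year2017])
coef, *_ = np.linalg.lstsq(X, profits, rcond=None)
did_reg = coef[3]

print("Diff-in-diff from means:     ", round(did_means, 3))
print("Diff-in-diff from regression:", round(did_reg, 3))
```

The two numbers agree up to floating-point error because the regression with both dummies and their interaction fits each of the four cell means exactly.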

Difference-in-Differences

Difference-in-Differences for Liquor Profits in Indiana and Michigan

Difference-in-Differences

Difference-in-differences (diff-in-diff) is the difference in the temporal change for the outcome between the treated and untreated group

Diff-in-diff is highly effective and applies to dichotomous treatments spanning two periods

A fixed effects model is a data-generating process for panel data that includes controls for cross-sectional groups

The controls for cross-sectional groups are called fixed effects

For a data-generating process to be characterized as a fixed effects model, it need only have controls for the cross-sectional groups

We can also control for time periods, for example by including a time trend:

Outcome_{igt} = α + δ_2 Group2_g + … + δ_G GroupG_g + γ Time_t + β Treatment_{gt} + U_{igt}

The Fixed-Effects Model

The Fixed-Effects Model

By controlling for the groups and periods, many possible confounding factors in the data-generating process are eliminated

Can add controls (Xigt’s) beyond the fixed effects and time dummies to help eliminate some of the remaining confounding factors

Two ways of estimating the fixed-effects model include: dummy variable estimation and within estimation

Dummy variable estimation uses regression analysis to estimate all of the parameters in the fixed effects data-generating process

Regress the Outcome on dummy variables for each cross-sectional group (except the base unit), dummy variables for each period (except the base period), and the treatment
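Dummy variable estimation can be sketched on a simulated balanced panel. All of the following is assumed for illustration: the panel dimensions, the effect sizes, and a treatment (a tax rate) that is deliberately correlated with the group effects so that a regression without fixed effects would be confounded.

```python
import numpy as np

rng = np.random.default_rng(5)
G, T, stores = 6, 5, 40    # hypothetical balanced panel: groups, periods, stores
N = G * T * stores

group = np.repeat(np.arange(G), T * stores)
period = np.tile(np.repeat(np.arange(T), stores), G)

# Assumed data-generating process.
group_effect = rng.normal(0.0, 10.0, G)
period_effect = rng.normal(0.0, 3.0, T)
tax = 5.0 + 0.3 * group_effect[:, None] + rng.normal(0.0, 1.0, (G, T))
treatment = tax[group, period]             # varies by group and period
beta_true = -2.0
outcome = (50.0 + group_effect[group] + period_effect[period]
           + beta_true * treatment + rng.normal(0.0, 2.0, N))

# Dummies for every group except the base group (group 0)
# and every period except the base period (period 0).
group_dummies = (group[:, None] == np.arange(1, G)).astype(float)
period_dummies = (period[:, None] == np.arange(1, T)).astype(float)

X = np.column_stack([np.ones(N), group_dummies, period_dummies, treatment])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
beta_hat = coef[-1]

# For comparison: regression that ignores the fixed effects.
X_naive = np.column_stack([np.ones(N), treatment])
naive, *_ = np.linalg.lstsq(X_naive, outcome, rcond=None)

print("Treatment coefficient, dummy variable estimation:", round(beta_hat, 2))
print("Treatment coefficient, no fixed effects:         ", round(naive[1], 2))
```

The first G - 1 slope coefficients in `coef` are the estimated fixed effects relative to the base group, and the next T - 1 are the period effects relative to the base period.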

The Fixed-Effects Model: Dummy Variable Estimation

Subset of Dummy Variable Estimation Results for Sales Regressed on Tax Rate

The Fixed-Effects Model: Dummy Variable Estimation

Interpreting the table from the previous slide:

Each state coefficient measures the effect on a store’s profits of moving the store from the base state (State 1) to that alternative state, for a given year and tax rate

Each year coefficient measures the effect on a store’s profits of moving the store from the base year (Year 1) to that alternative year, for a given state and tax rate

The coefficient on Tax Rate measures the effect on a store’s profits of changing the Tax Rate, for a given state and year

The Fixed-Effects Model: Within Estimation

Within estimation uses regression analysis of within-group differences in variables to estimate the parameters in the fixed effects data-generating process, except for those corresponding to the fixed effects (and the constant)

Eliminates the need to estimate the coefficient for each fixed effect

The Fixed-Effects Model: Within Estimation

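Outcome_{igt} = α + δ_2 Group2_g + … + δ_G GroupG_g + γ_2 Period2_t + … + γ_T PeriodT_t + β Treatment_{gt} + U_{igt}

We estimate the parameters γ_2, …, γ_T, β via within estimation:

1. Determine the cross-sectional groups and calculate the group-level means \bar{Outcome}_g and \bar{Treatment}_g (the averages of Outcome_{igt} and Treatment_{gt} over all observations in group g)

2. Create new variables: Outcome*_{igt} = Outcome_{igt} − \bar{Outcome}_g and Treatment*_{igt} = Treatment_{gt} − \bar{Treatment}_g

3. Regress Outcome* on Treatment* and the Period dummy variables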
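The within-estimation steps can be sketched as follows and compared against dummy variable estimation on the same data. The simulated panel is balanced (every group observed equally often in every period), and all dimensions and effect sizes are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
G, T, stores = 6, 5, 40    # hypothetical balanced panel
N = G * T * stores

group = np.repeat(np.arange(G), T * stores)
period = np.tile(np.repeat(np.arange(T), stores), G)

# Assumed data-generating process with group and period effects.
group_effect = rng.normal(0.0, 10.0, G)
period_effect = rng.normal(0.0, 3.0, T)
tax = 5.0 + 0.3 * group_effect[:, None] + rng.normal(0.0, 1.0, (G, T))
treatment = tax[group, period]
beta_true = -2.0
outcome = (50.0 + group_effect[group] + period_effect[period]
           + beta_true * treatment + rng.normal(0.0, 2.0, N))

# Step 1: group-level means of Outcome and Treatment.
group_n = np.bincount(group)
outcome_bar = np.bincount(group, weights=outcome) / group_n
treatment_bar = np.bincount(group, weights=treatment) / group_n

# Step 2: within-group deviations.
outcome_star = outcome - outcome_bar[group]
treatment_star = treatment - treatment_bar[group]

# Step 3: regress Outcome* on Treatment* and the period dummies
# (a constant is included alongside the dummies).
period_dummies = (period[:, None] == np.arange(1, T)).astype(float)
X_within = np.column_stack([np.ones(N), period_dummies, treatment_star])
coef_within, *_ = np.linalg.lstsq(X_within, outcome_star, rcond=None)
beta_within = coef_within[-1]

# Dummy variable estimation on the same data, for comparison.
group_dummies = (group[:, None] == np.arange(1, G)).astype(float)
X_lsdv = np.column_stack([np.ones(N), group_dummies, period_dummies, treatment])
coef_lsdv, *_ = np.linalg.lstsq(X_lsdv, outcome, rcond=None)
beta_lsdv = coef_lsdv[-1]

print("Within estimate of beta:        ", round(beta_within, 4))
print("Dummy variable estimate of beta:", round(beta_lsdv, 4))
```

In this balanced panel the two approaches return the same treatment coefficient up to floating-point error, but the within regression never estimates the G - 1 group coefficients.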

Comparing Estimation Methods

Dummy variable estimation provides estimates for the fixed effects (the effects of switching groups on the outcome), whereas within estimation does not

For dummy variable estimation R-squared is often misleadingly high, suggesting a very strong fit

For within estimation, R-squared is more indicative that the variation in Treatment is explaining variation in the Outcome

Both estimation methods eliminate confounding factors that are fixed across periods for the groups or are fixed across groups over time

Both estimation methods could yield inaccurate estimates if there are unobserved factors that vary within a group over time
