News and Events

“The Future of Longitudinal Studies:
What we know; What we don’t know; What we need to know”

Inferring Causality from Longitudinal Studies
Chaired by: Elizabeth Owens, University of California, Berkeley
Friday, March 21, 2003

David A. Freedman
"Using Regression Models to Infer Causation"
University of California, Berkeley

This talk focused on how causation can be inferred from models of observational data only under very specific circumstances: namely, when the modeling assumptions are valid. These assumptions concern which variables are entered into the regression equation, how error terms and latent variables are handled, and how causal relationships and invariance are understood. Statistical models are just models, and it is important to know the hidden assumptions behind each one. Because many researchers do not take the time to do this, inferences from scientific studies are often conditional on unexamined (and often invalid) assumptions. Other strategies for inferring causality may appear cruder, but they are often more successful than the “fancier” models common in the psychological sciences. Several epidemiological studies serve as examples, among them John Snow’s discovery of how cholera spreads and the research establishing that smoking causes lung cancer (for more examples, see http://www.stat.berkeley.edu/~census/521.pdf).
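The point about which variables enter the regression equation can be illustrated with a small simulation. This is a minimal sketch, not an example from the talk: the variable names, the coefficients (0.8 and 1.0), and the sample size are all assumptions chosen for illustration. Here x has no causal effect on y at all, yet a regression that leaves out the common cause z reports a clearly nonzero slope.

```python
import random


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(n=20000, seed=0):
    """Return (naive, adjusted) estimates of the effect of x on y.

    Hypothetical data-generating process (an assumption, not real data):
    z is a confounder that drives both x and y; the true effect of x is 0.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.8 * zi + rng.gauss(0, 1) for zi in z]
    y = [1.0 * zi + rng.gauss(0, 1) for zi in z]

    # Regression of y on x alone: z is omitted, so the slope absorbs its effect.
    naive = slope(x, y)

    # Coefficient on x when z is also in the equation, computed via the
    # Frisch-Waugh-Lovell theorem: residualize x and y on z, then regress.
    adjusted = slope(residuals(z, x), residuals(z, y))
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate()
    print(f"slope with z omitted:  {naive:.3f}")
    print(f"slope with z included: {adjusted:.3f}")
```

Under these assumed coefficients the population value of the naive slope is 0.8 / 1.64, roughly 0.49, while the adjusted slope is close to zero. The adjustment works here only because the simulation knows that z is the sole confounder; with real observational data that knowledge is itself one of the assumptions that has to be stated and checked.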

To illustrate the often blind reliance on fancy-sounding regression models, Freedman presented a study (Pargament et al., 2001) that inferred that negative religious feelings significantly increased the risk of death among seriously ill, hospitalized patients by 6%. The study used a log-linear statistical model (the Cox model) that supposedly adjusted for study limitations (e.g., large amounts of missing data and late entry into the study). However, the Cox model rests on assumptions (e.g., that effects are multiplicative and constant across time and people) that may not have been appropriate, and the analysis made no adequate adjustment for potentially important factors such as income or degree of ill health. Both the measures and the modeling assumptions in this study were questionable, leaving the “findings” spurious at best.
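For reference, the Cox proportional hazards model in its standard textbook form makes the "multiplicative and constant" assumption explicit. The notation below is generic and is not taken from the talk or from the study: h_0(t) is a baseline hazard, x_i is the covariate vector for person i, and β is the coefficient vector.

```latex
h(t \mid x_i) = h_0(t)\,\exp\bigl(\beta^{\top} x_i\bigr),
\qquad
\frac{h(t \mid x_i)}{h(t \mid x_j)} = \exp\bigl(\beta^{\top}(x_i - x_j)\bigr)
```

The hazard ratio between any two people does not depend on t, and the same β applies to everyone; covariates act by multiplying the baseline hazard. If the reported 6% is read as a hazard ratio of 1.06 per unit of the religious-feelings score (an assumption, since the summary does not specify the unit), the corresponding coefficient would be β = ln 1.06 ≈ 0.058.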

At the end of the talk, Freedman offered practical suggestions for dealing with these often-overlooked methodological issues. In any study, it is crucial to 1) archive data, equations, and programs; 2) list assumptions (regarding causation, latent variables, etc.) and indicate which have and have not been checked, and how; and 3) never omit hidden assumptions or indicate that they have been checked when they have not.

 

