Revised: March 15, 2010 

School of Public Health

University of California, Berkeley

Theory-Based Data Analysis in Public Health Research

PB HLTH 298-64 (Group Study), Spring, 2010

 

Instructor:                   Norm Constantine

237 University Hall

(925) 284-8118

nconstantine@berkeley.edu

Units:                          2-4 units depending on projects, by arrangement

When offered:            Tuesdays, 4:00 to 6:00 p.m.

Location:                    590-L University Hall

CC#                              76679

Participants:              Developed for DrPH students; other doctoral and advanced master’s students are welcome by special arrangement as space permits (max=8)

 

Course Description

 

This seminar is intended to assist students in (1) developing their abilities to critically appraise the design and interpretation of multivariable analyses of non-experimental quantitative data, and (2) translating the theory guiding their own dissertation research into a coherent plan for analysis.

 

Five types of analytical strategies for multivariable analysis will be reviewed and critiqued. These include three atheoretical strategies – bivariate (unadjusted), simultaneous (standard), and automated (stepwise, perhaps better classified as an anti-theory strategy), as well as two theory-based strategies – sequential (hierarchical), and elaborative (exclusionary/inclusive). The focus of this seminar will be on these last two theory-based strategies, and on conceptual design and causal inference issues, rather than statistical analysis issues.

 

The group will meet for weekly three-hour sessions. Each session will include (1) discussion of concepts and issues from the current week’s readings, and (2) discussion and critique of students’ own research and/or student-presented published examples from their own public health specialty areas. Both elements are priorities of this course, and as much as possible, the two will be integrated.

 

The etiology of this course is similar to that described by sociologist and UCLA School of Public Health Professor Carol Aneshensel in her preface to our primary text. We will build on her and her students’ experiences and insights to achieve some of the same successes:

 

“This text was conceived at a student's preliminary doctoral exam, and its development has been a response to questions posed by students in the graduate course that eventually grew from this brainchild. I was struck, not for the first time, by a technically correct description of a series of statistical techniques that had little relevance to the theory to be tested in the proposed research. Lest students think they have been unfairly singled out, I should mention that this disconnection is also found in the proposals of more seasoned investigators. I was puzzled because the students who struggled most with their data analysis were uniformly bright, articulate in their understanding of theory, and, most perplexingly, well trained in the techniques of multivariate statistics. This pattern suggested that something was missing from the way we were training the next generation of social scientists.

 

“With considerable chutzpah, I started a seminar on multivariate data analysis to rectify this problem. I soon discovered that I had no idea how to teach someone else how to translate his or her theory into a coherent plan for data analysis. Fortunately students in those first few years did not seem to realize this chaotic state of confusion and found that the seminar enabled them to integrate what they had learned over several years in other courses. I was pleased, of course, and more than ready to accept the credit for their astute insights. In retrospect, it is clear students were able to bring their ideas and their analyses closer together because they presented their analyses to their classmates, responded to their critiques, and offered the same in exchange. This book is the result of eavesdropping on their conversations.”

-- Aneshensel (2002)

 

Prerequisites

 

This is a conceptual applications seminar and is not focused on statistical mechanics. To benefit from the seminar, it is necessary to have sufficient understanding of the basic concepts of multiple linear regression, such as variance and covariance, R², b and Beta coefficients, partial and semi-partial correlation, statistical control, and regression assumptions. Prior completion of one of the following is expected: PH241 Statistical Analysis of Categorical Data; PH241 Multivariate Statistics; Educ 275B Data Analysis in Educational Research II; or another equivalent course that sufficiently covers the fundamentals of multiple linear and/or logistic regression.

 

Students are also expected to have developed dissertation or other research questions, and to have at least begun work on theoretical frameworks, so that they can apply these to the exercises of this class.

 

Learning Objectives  

 

After successful completion of this course, students will be able to:

 

1. Explain the critical role of theory in public health research, and the fundamental differences between predictive and explanatory research;

2. Explain and illustrate the differences between the statistical problem of multicollinearity and the interpretative challenge of overlapping covariance;

3. Distinguish between different types of multivariable analytic strategies, and explain the advantages, disadvantages, and appropriate applications for each;

4. Explain and distinguish between confounding, suppression, mediation, and moderation;

5. Discuss the underdetermination of theory by available evidence and the Duhem–Quine thesis as they relate to causal inference questions supported by multivariable analyses;

6. Critically appraise the design, reporting, and interpretation of published multivariable analyses in the student’s specialty area of public health; and

7. Develop and implement a coherent theory-based design, analysis, and report for a theory-based multivariable analysis in the student’s specialty area of public health.

 

Student Responsibilities

 

1. Thoroughly study all assigned readings prior to class, and actively and appropriately participate in all discussions;

2. Identify, present, and critically discuss published examples from students’ specialty area of public health, and contribute to the discussion of examples from other students’ areas; 

3. Develop a theory-based multivariable analysis design, based on an appropriate application of a theory-based strategy, for one or more research questions from the student’s own specialty area; and

4. Implement the above theory-based multivariable analysis design using a relevant data set and appropriate statistical software. Provide a written report and verbal presentation to the seminar.

 

Required Text

Aneshensel, C.S. (2002). Theory-based data analysis for the social sciences. Thousand Oaks, CA: Sage, Pine Forge Press.

 

Recommended Texts

Introductory review: Allison, P. D. (1999). Multiple regression: A primer. Thousand Oaks, CA: Sage, Pine Forge Press. (An excellent review of the mechanics and assumptions of multiple regression analysis, from the same series as the Aneshensel book. However, Allison’s approach is not theory-based, and in some places it is downright theoretically confused. Yet, with that caveat in mind, I do otherwise recommend this clearly written, accessible, and engaging book.)

Comprehensive reference: Tabachnick, B.G. & Fidell, L.S. (2007). Using multivariate statistics (especially Chapter 5: Multiple regression; Chapter 10: Logistic regression). Boston, MA: Allyn and Bacon.

 

Other Readings (In course outline below)

 

Course Outline

Week 1

·       Introduction and course overview

·       Issues in confirmation bias

·       Confirmation, refutation, and corroboration

·       Prove versus support or discredit

·       Discussion and critique of students’ research interests and plans

Week 2

·       Prediction vs. explanation

·       Review of the nature and importance of theory

·       Brief review of basic concepts in multiple regression analysis

·       Multicollinearity vs. overlapping covariance

·       Discussion and critique of students’ research interests and plans

 

Readings

 

1.     Phillips, D.C. (2000). The expanded social scientist’s bestiary: A guide to fabled threats to, and defenses of, naturalistic social science. (Preface, and Chapter 6: New philosophy of science).

 

2.     Hughes, J. N. (2000). The essential role of theory in the science of treating children: Beyond empirically supported treatments. Journal of School Psychology, 38, 301–330.

 

3.     Pedhazur, E. J. (1997). Prediction and explanation (pp. 195-198, 211). In Multiple regression in behavioral research: Prediction and explanation.

 

4.     Pedhazur, E. J. & Schmelkin, L. P. (1991). Multiple regression analysis (and cautions on variance partitioning) (pp. 413-428). In Measurement, design, and analysis: An integrated approach.

Week 3

·       Review of assumptions in linear regression

·       Review of assumptions in logistic regression

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research plans

 

Readings  

 

1.     Tabachnick, B. G. & Fidell, L. S. (2007). Using multivariate statistics (5th edition). New York: Allyn and Bacon.  (Assumptions, pp. xxx-xxx )

 

2.     Lumley, T. (2002). The importance of the normality assumption in large public health data sets. Annual Review of Public Health, 23:151-169.

Week 4

·       Five analytic strategies in multiple regression

·       Victora’s generic hierarchical conceptual framework

·       Critique and discussion of parent-adolescent communication example

·       Discussion and critique of students’ research plans

 

Readings

 

1.     Tabachnick, B. G. & Fidell, L. S. (2007). Using multivariate statistics (5th edition). New York: Allyn and Bacon. (Major types of multiple regression, pp. 136-144.)

 

2.     Victora et al. (1997). The role of conceptual frameworks in epidemiological analysis: A hierarchical approach. International Journal of Epidemiology, 26(1).

 

3.     Phillips, D.C. (2000). Chapter 8 (Popperian rules for research design).

 

Week 5

·       Clues to the puzzle of scientific evidence

·       Theories and laws

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research plans

 

Readings

 

1.     Haack, S. (2003). Defending science -- Within reason: Between scientism and cynicism. New York: Prometheus. Preface, and Chapter 3 (Clues to the puzzle of scientific evidence: A more-so story).

 

2.     Phillips, D.C. (2000). Chapter 12 (Theories and Laws).

Week 6

·       Rosenberg’s elaboration model

·       Confounding and suppression

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research plans

 

Readings

 

1.     Babbie, E. (2001). The practice of social research (Chapter 16: The elaboration model). Belmont, CA: Wadsworth.

 

2.     Constantine, N. A. (2008). Simpson’s paradox. In S. Boslaugh (Ed.), Encyclopedia of epidemiology: Vol. 1 (pp. 973-974). Thousand Oaks, CA: Sage Publishers.

 

3.     Greenland, S. and Morgenstern, H. (2001). Confounding in health research. Annual Review of Public Health, 22:189–212

Week 7

·       Aneshensel’s elaboration model: exclusionary/inclusive approach

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

 

Readings

 

1.     Aneshensel (2002), Preface

 

2.     Aneshensel (2002), Chapter 1: Introduction to Theory-Based Data Analysis

 

3.     Aneshensel (2002), Chapter 2: The Logic of Theory-Based Data Analysis

 

Week 8

·       Associations and relationships

·       The focal relationship

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

 

Readings

 

1.     Aneshensel (2002): Chapter 3: Associations and Relationships

 

2.     Aneshensel (2002): Chapter 4: The Focal Relationship: Demonstrating Internal Validity

Week 9

·       Ruling out alternative explanations

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

 

Readings

 

1.     Donald Campbell on ruling out alternative explanations

 

2.     Aneshensel (2002): Chapter 5: Ruling Out Alternative Explanations: Spurious and Control Variables

 

3.     Aneshensel (2002): Chapter 6: Ruling Out Alternative Explanations: Additional Independent Variables

 

Week 10

·       Elaborating an explanation

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

 

Readings

 

1.     Aneshensel (2002): Chapter 7: Elaborating an Explanation: Antecedent, Intervening, and Consequent Variables

 

2.     Baron, R.M. & Kenny, D.A.  (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

 

3.     Frazier, P.A., Tix, A.P., & Barron, K.E. (2004). Testing moderator and mediator effects in counseling psychology research. Journal of Counseling Psychology, 51, 115-134.

Week 11

·       Conditions of Influence

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

 

Readings

 

1.     Aneshensel (2002): Chapter 8: Conditions of Influence: Effect Modification

 

2.     Frazier, P.A., Tix, A.P., & Barron, K.E. (2004). Testing moderator and mediator effects in counseling psychology research. Journal of Counseling Psychology, 51, 115-134.

Week 12

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

Week 13

·       Review, contrast, and critiques of the sequential and elaboration strategies

·       The lingering problem of theory under-determination in both strategies

·       Group critique of student selected published examples (readings TBD)

·       Discussion and critique of students’ research designs and analyses

 

Readings

 

1.     Aneshensel (2002): Chapter 9: Synthesis and Comment

Week 14

·       Four student presentations and class critique and discussion (10-minute presentation, 20-minute discussion each)

Week 15

·       Four student presentations and class critique and discussion (10-minute presentation, 20-minute discussion each)

 

 

 

 


Words of Wisdom (instructor’s known biases)

 

·       "You cannot fix by analysis what you bungle in design." (Light, Singer, & Willett, 1990)

·       “Occam’s razor applies to methods as well as to theories.” (Wilkinson & the APA Task Force on Statistical Inference, 1999)

·       “Analytic techniques, no matter how fancy they may be, cannot salvage a misspecified model.” (Pedhazur & Schmelkin, 1991)

·       "With the data usually available for such studies, there is simply no logical or statistical procedure that can be counted on to make proper allowances for uncontrolled preexisting differences between groups." (Lord, 1967, quoted in Pedhazur & Schmelkin, 1991)

·       "One may well wonder what exactly it means to ask what the data would look like were they not what they are." (Anderson, 1963)

·       “A willingness to entertain rival interpretations, an ability to place knowledge within broader contexts, and an openness to new ways of conceptualizing problems are essential to scientific inquiry. Theory serves these functions as well as directs inquiry, unifies and systematizes knowledge, and makes sense of what (might) otherwise be inscrutable empirical facts” (Hughes, 2000)

·       “The model does not ‘confirm’ causal relationships. Rather it assumes causal links and then tests how strong they would be if the model were a correct representation of reality.” (Shadish, Cook, & Campbell, 2002).

·       “Structural equation modeling is more useful for rejecting false models than for somehow proving whether a given model is in fact true.” (Kline, 1998).

·       “More and more I have come to the conclusion that the core of the scientific method is not experimentation per se but rather the strategy connoted by the phrase ‘plausible rival hypotheses’.” (Campbell, 1989)

 

Words of Wisdom References

 

Campbell, D.T. (1989). Foreword to R. K. Yin (2003), Case study research: Design and methods (3rd ed.). Thousand Oaks, CA: Sage. (first edition 1989).

Hughes, J. N. (2000). The essential role of theory in the science of treating children: Beyond empirically supported treatments. Journal of School Psychology, 38, 301–330.

Kline, R. B. (1998). Principles and practice of structural equation modeling. NY: Guilford.

 

Light, R. J., Singer, J. D., & Willett, J. B. (1990). By design: Planning research on higher education. Cambridge, MA: Harvard University Press.

Pedhazur, E. J. & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

Shadish, W. R., Jr., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton-Mifflin.

 

Wilkinson, L. & the APA Task Force on Statistical Inference (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.