Empirical Research Methods

The Empirical Research Methods course is still under development; we are making it available in the meantime.

Regression analysis is an enormously popular and powerful tool, used ubiquitously in the social and behavioral sciences. Most courses on the subject immediately dive into the mathematical aspects of the subject and illustrate the technique on problems that are already highly structured. As a result, most students come away with little idea of the wide range of problems to which regression analysis can be applied and how to represent those problems in a way that cleverly utilizes readily available data. Few understand, at a conceptual level, the limitations of regression analysis.

The OLI Empirical Research Methods course bridges the gap between the mathematical foundations of regression and its practical application. We teach students how to move from an interesting question about the world to a regression model that, when estimated, meaningfully addresses the question asked. The course emphasizes causal analysis as the main research goal and multivariate linear regression as the main statistical tool. We teach a process that involves:

  1. Formulating a research problem,
  2. Developing and formalizing hypotheses,
  3. Collecting data relevant to these hypotheses,
  4. Analyzing the data using an appropriate regression model, and
  5. Critically interpreting the results of these analyses.

A learner who successfully completes our course will be able to do much more than mechanically estimate a regression model with standard statistical software like SPSS or Minitab, or check whether coefficient estimates are “significant” at the .05 or .01 level. They will be able to bring to bear their own scientific imagination in order to use regression as a tool to investigate problems about the real world. They will be able, perhaps not with professional sophistication, but with competence, to do real empirical research.

Prerequisites

We assume that learners entering Empirical Research Methods (ERM) have taken at least a semester or year-long course in statistics, and through this or some other experience have been exposed to the following concepts:

  • Random Variables
  • Population and Samples
  • Data Tables (rows=sample units and columns=variables)
  • Summary Statistics: Mean, Median, Variance, Covariance, Correlation
  • Graphs: Boxplots, Barcharts, Histograms, Scatterplots
  • Inference: standard errors, confidence intervals, hypothesis tests, etc.
  • Models: Bivariate Regression, perhaps ANOVA

If learners have not had such exposure, they can follow the appropriate links into the OLI Introductory Statistics course to review the required concepts.

Learning Objectives

Our primary goal is to teach learners to bring empirical data to bear on interesting questions by using regression analysis in a way that is scientifically credible. We begin by considering problems in which hypotheses have been formulated, the unit of analysis defined, and data located to construct variables to test the hypotheses. The next task is to determine how to construct variables that are consistent with the hypothesized relationships and that can be built from the available data. We provide various examples that illustrate how to do this. Next we teach learners to relate these variables with a regression model, and then to interpret the results of the regression estimation in a scientifically informed manner, both with respect to the inferences that can be made and the inferences that cannot.

Sequence of Instruction

Unit 1: Welcome and Course Overview
1.1 Preview of the Course

Unit 2: Regression, Prediction and Causation
2.1 Prediction
2.2 Causation
2.3 The Bivariate Regression Equation Interpreted Causally

Unit 3: The Prima Facie Case
3.1 Overview
3.2 Steps in making the Prima Facie Case

Teaching approach

We teach the topics above using the approach that has proven successful in our OLI statistics and causal reasoning courses. Many of the activities also use an extended version of StatTutor, the computer-based statistics tutor that supports the OLI Introductory Statistics course, and the Causality Lab, the virtual social science experiment environment that supports the OLI Causal and Statistical Reasoning course. Because the three courses share this structure, instructors can easily combine and sequence modules from the statistics course, the causal reasoning course, and the Empirical Research Methods course to tailor a course in this domain to the needs of their students.

Each of the modules above follows the format of:

  1. Locate the current topic in the big picture: the stage of the ERM process
  2. Situate task/concepts to be taught in an interesting case study
  3. Present tasks/concepts abstractly
  4. Present interactive exercises and tutors to support learning of tasks/concepts
  5. Work through an extended problem-solving episode in an interesting case study using Causality Lab and StatTutor

We use many case studies and data sets to illustrate various themes that arise in the application of regression methods to interesting problems. For example:

Education-Poverty

Question: Does insufficient education cause poverty?

Approach: From census data, we can define variables capturing the level of education and the level of income in a district. For example, we can compute the number of individuals of a given age in the district, say males aged 35, who did not graduate from high school, and the number of individuals in the district who live in a household with total income below the poverty line. Clearly, there are other things we could have measured about income and poverty, but our concern here is not what to measure but how to use the data once the measures have been selected. So the question we focus on in this case study is how to use the data to construct meaningful variables.

One possibility is to relate the number of individuals in households below the poverty line to the number of males aged 35 who did not graduate from high school. Another is to relate the fraction of individuals in households below the poverty line to the fraction of males aged 35 who did not graduate from high school. The two remaining possibilities mix the forms: the fraction of individuals in households below the poverty line could be related to the number of males aged 35 who did not graduate from high school, or the number of individuals in households below the poverty line could be related to the fraction of males aged 35 who did not graduate from high school. Which of these options provides reliable inferences about the question of interest? It turns out that only one does: the one in which both variables are expressed as fractions. We explain why and illustrate the logic by showing how spurious inferences can arise under the other approaches. We also discuss other problems that share this feature and identify what the feature is.
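To make the count-versus-fraction issue concrete, here is a minimal simulation sketch. It is not taken from the course materials: the districts, population sizes, and rates are all hypothetical, and the count of non-graduates is simplified to population times a dropout rate. The dropout and poverty rates are drawn independently, so any association between the two counts can only come from district size.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_districts(n_districts=500, seed=0):
    """Hypothetical districts whose dropout and poverty rates are independent."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_districts):
        population = rng.randint(1_000, 100_000)  # district size varies widely
        dropout_rate = rng.uniform(0.05, 0.30)    # drawn independently ...
        poverty_rate = rng.uniform(0.05, 0.30)    # ... of the poverty rate
        rows.append((population * dropout_rate,   # count of non-graduates
                     population * poverty_rate,   # count below the poverty line
                     dropout_rate,
                     poverty_rate))
    return rows

rows = simulate_districts()
dropout_n, poverty_n, dropout_frac, poverty_frac = zip(*rows)

r_counts = pearson(dropout_n, poverty_n)
r_fracs = pearson(dropout_frac, poverty_frac)
print(f"counts vs counts:       r = {r_counts:.2f}")
print(f"fractions vs fractions: r = {r_fracs:.2f}")
```

Under these assumptions the two counts should come out clearly positively correlated, simply because large districts have more of everything, while the two fractions should show a correlation near zero. The sketch covers only the count-count and fraction-fraction options; the two mixed options are not simulated here.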

Liberty Ships

Question: Do organizations learn how to produce a product more efficiently as they gain experience producing the product? Organizational learning has always been a big issue regarding the production of goods used in national defense. For example, the Air Force buys planes from private firms like Boeing. The question is how much it should pay for the planes. If the planes are new and have never been produced before, an estimate can be put together of the initial cost of producing them. But it is anticipated that as Boeing or any other contractor gains experience producing the planes, it may learn how to produce them more cheaply. If so, and if this can be anticipated, the government would like to take it into account in deciding how much to contract to pay Boeing for the production of a large number of the planes. One can also see that if firms compete in an industry, their strategy regarding how high a price to charge for their product might depend on how much they expect to learn from production. If they expect to learn a lot, and this will enable them to lower their cost of production in the future, then they might want to charge a lower price now in order to get a contract and begin the learning process. So the key question is whether organizations in fact learn to produce goods more efficiently over time and what conditions the rate at which they learn.

Approach: A useful setting in which organizational learning has been studied is the production of Liberty Ships during World War II. This is a celebrated case because allegedly little changed over time in the methods used to produce Liberty Ships, so that any reduction in the cost of producing them during the war was attributable to organizational learning. For the first 30 months in which Liberty Ships were produced, monthly data are available on the number of ships produced and the cost per ship. The question here is how to construct a measure of experience to test whether experience influences the cost of production. Alternative hypotheses have been posed about how experience operates to lower cost. One hypothesis is that experience depends on the total number of units of output, in this case ships, that have been produced in the past. So at any given time, the cost of production will be lower the greater the total number of ships previously produced. Another hypothesis is that learning requires time, so that the cost of production at any given moment will be lower the longer Liberty Ships have been in production. A third hypothesis is that organizational learning occurs only when organizations are faced with novel situations, in particular when they are asked to produce a higher level of output in a given time period than they have ever been asked to produce in the past. In the case of Liberty Ships, this hypothesis implies that the cost of production will be determined by the highest level of monthly output that has previously been reached. The challenge here is to figure out how to measure experience corresponding to each of these alternative hypotheses based on the monthly data, and then to formulate a test for each hypothesis to ascertain which seems to best describe the process of organizational learning. Further, the tests will need to be used to assess whether organizational learning in fact occurred in the production of Liberty Ships.
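As a rough illustration of how the three experience measures could be built from a monthly output series, consider the sketch below. The function name `experience_measures`, the field names, and the ship counts are all made up for illustration; they are not the historical Liberty Ship data or the course's own code. Each measure uses only information from months before the current one.

```python
from itertools import accumulate

def experience_measures(monthly_output):
    """Return one dict per month holding three candidate experience measures,
    each computed from the months strictly before the current one."""
    # Hypothesis 1: total ships produced before month t.
    cumulative = [0] + list(accumulate(monthly_output))[:-1]
    # Hypothesis 3: highest monthly output reached before month t.
    prior_peak = [0] + list(accumulate(monthly_output, max))[:-1]
    return [
        {
            "month": t + 1,
            "cumulative_output": cumulative[t],
            "months_elapsed": t,  # Hypothesis 2: time already spent producing
            "prior_peak_output": prior_peak[t],
        }
        for t in range(len(monthly_output))
    ]

# Made-up monthly ship counts, for illustration only.
output = [2, 5, 4, 9, 7, 12]
for row in experience_measures(output):
    print(row)
```

Each column could then serve as the explanatory variable in a separate regression of cost per ship on experience, which is one way to compare the hypotheses. The regression step itself is not shown here.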

Open Learning Initiative at Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213. Funding for the Open Learning Initiative at Carnegie Mellon has been provided by The William and Flora Hewlett Foundation. This work is licensed under the Creative Commons Attribution-Noncommercial Share Alike 3.0 License.