Year 1 - Fall
Students are introduced to the faculty and their interests, the field of statistics, and the facilities at Carnegie Mellon. Each faculty member gives at least one elementary lecture on some topic of his or her choice. In the past, topics have included: the field of statistics and its history, large-scale sample surveys, survival analysis, subjective probability, time series, robustness, multivariate analysis, psychiatric statistics, experimental design, consulting, decision-making, probability models, statistics and the law, and comparative inference. Students are also given information about the libraries at Carnegie Mellon and current bibliographic tools. In addition, students are instructed in the use of the Departmental and University computational facilities and available statistical program packages.
This course covers the basics of statistics. We will first provide a quick introduction to probability theory, and then cover fundamental topics in mathematical statistics such as point estimation, hypothesis testing, asymptotic theory, and Bayesian inference. If time permits, we will also cover more advanced and useful topics including nonparametric inference, regression and classification. Prerequisites: one- and two-variable calculus and matrix algebra.
This course covers the fundamentals of theoretical statistics. Topics include: probability inequalities, point and interval estimation, minimax theory, hypothesis testing, data reduction, convergence concepts, Bayesian inference, nonparametric statistics, bootstrap resampling, VC dimension, prediction and model selection.
This course covers the foundations of linear regression, including theory, computation, diagnostics, and generalized linear models. Extensions to nonparametric regression, including splines, kernel regression, and generalized additive models. Discussion of tools to compare statistical models, including hypothesis tests, cross-validation, and bootstrapping. Topics in nonparametric regression and machine learning as time permits, such as regression trees, boosting, and random forests. Emphasis on writing data analysis reports that answer substantive scientific questions with appropriate statistical tools. Students will be equipped with the tools needed to explore a substantive scientific question with data, translate scientific questions into statistical questions, compare different modeling approaches rigorously, and write their results in a clear manner.
A detailed introduction to elements of computing relating to statistical modeling, targeted to PhD students and master's students in Statistics & Data Science. Topics include important data structures and algorithms; numerical methods; databases; parallelism and concurrency; and coding practices, program design, and testing. Multiple programming languages will be supported (e.g., C, R, and Python). Those with no previous programming experience are welcome but will be required to learn the basics of at least one language via self-study.