Carnegie Mellon University
Cutting-Edge Curriculum

Transforming professionals into data-driven decision makers

At Carnegie Mellon University, we believe that anyone can harness the power of data, from the seasoned programmer in financial services to the non-technical manager in HR. To transform professionals into data-driven decision makers, our faculty have designed the coursework for this certificate with evidence-based pedagogical practices in mind.

These practices stem from the research conducted in CMU’s world-renowned Statistics Pedagogy Lab Group, a group that seeks to modernize the statistics curriculum and spearhead the development of pedagogical tools and practices that benefit students from all backgrounds.

Curriculum Overview

After you enroll in the Foundations of Data Science graduate certificate, you will take five graduate-level, credit-bearing courses. Each course will appear on your Carnegie Mellon transcript with the grade earned.

To earn the certificate, you must successfully complete all courses in the program. If you are interested in only one course, however, you may take that course alone, and it will appear on your transcript with the grade earned.

The certificate includes the following courses taught by CMU faculty:

Course Number: 36-640

Units: 6 units

Learn how to understand and correctly apply fundamental terminology and techniques in future data analysis situations. Explore the theoretical aspects of probability and statistical inference, including basic probability, random variables, univariate and bivariate probability distributions, statistics, likelihood, point and interval estimation, hypothesis testing, and the frameworks underlying linear and logistic regression and Naive Bayes. Mathematical details are supplemented with computer-based examples and exercises (e.g., visualization and simulation).

Course Number: 36-641 

Units: 6 units

Prerequisite: Probability & Statistics for Data Science

This course is designed to teach you how to approach and analyze data. Topics include data input/output, processing, exploratory analysis, clustering, common regression and classification models (including those of classical statistics and of machine learning), and experimental design. Practice using these methods on real-world data and subsequently apply them when analyzing data in the program's Data Science Capstone course.

Course Number: 36-642

Units: 6 units

Explore the most common forms of graphical displays and their (mis)uses. Learn how to create well-designed graphs and understand them from a statistical perspective, while working with increasingly common, complex data structures (temporal, spatial, and text data). All assignments will be completed in R and/or Python. Throughout the course, communication skills will play an important role.

Course Number: NA

Units: 6 units

Learn how to apply computational thinking to data processing and analysis problems through common programming languages (R and Python). Topics include defining and manipulating vectors, lists, and data frames; processing strings and applying regular expressions in string searches; reading and writing data; writing functions; applying numerical methods such as integration and optimization; working with date-and-time-based data; and applying unit testing.

Course Number: NA

Units: 6 units

Prerequisites: The first four courses must be completed prior to taking the Capstone course.  

In the capstone course, work with real-world data to apply the skills and knowledge acquired throughout the program. Supported by subject matter experts, you will have the opportunity to practice synthesizing and communicating results in a clear and concise manner. 

 

Meet Our World-Class Faculty

Dr. Peter Freeman

Education: Ph.D., University of Chicago

Research Areas: Astrostatistics, Statistics Pedagogy, Statistics for Physical Sciences

Dr. Ron Yurko

Education: Ph.D., Carnegie Mellon University

Research Areas: Sports Analytics, Statistical Genetics, Selective Inference

The Building Blocks of Our Curriculum

Spiral Learning

Spiral learning is an evidence-based pedagogical practice that helps students develop a deep understanding of data science in a short period of time. This approach emphasizes the importance of progressive learning instead of immediate proficiency in core concepts. Throughout the curriculum, our professors repeatedly revisit, or ‘spiral back’ to, key topics, each time encouraging a deeper understanding of the material. By returning to topics in progressively greater detail, you will build mastery of fundamental data science skills that you can apply in the future.

Experiential Learning

The Graduate Certificate in Foundations of Data Science is anything but traditional. Here, you will learn how to think like a data scientist by immersing yourself in a rich environment filled with hands-on experiences. Throughout the curriculum, you will practice statistical analysis techniques on real-world datasets; use programming languages like R and Python to create statistical graphs; and learn how to communicate your findings to various stakeholders. In the capstone course, you will work with real-world data under the guidance of subject matter experts to apply the skills you have gained throughout the program.

Real-World Context

While it’s important to develop technical skills in statistical analysis, data visualization, and computational thinking, we also believe in the power of application. How can data science be used to solve real societal and organizational issues? In our program, students from all industries, whether finance, marketing, healthcare, or beyond, are empowered to approach problems with a confident, data-centric mindset. With fundamental knowledge in data science, you’ll have the skills to apply data in meaningful ways for your team and organization.