18-899-K4 Big Data Science - Carnegie Mellon University Africa - Carnegie Mellon University

18-899-K4 Big Data Science


Special Topics in Signal Processing: Big Data Science

Course discipline: Electrical and Computer Engineering

Units: 6
Lecture/Lab/Rep hours/week: 
3 Lecture Hours Per Week
3 Lab Hours Per Week
Semester/year offered (fall/spring, even/odd/all years): Spring
Pre-requisites: Data and Inference and Applied Machine Learning mini-courses; background in a quantitative discipline (Engineering, Computer Science, Physics, Mathematics, Statistics); programming experience.

Course description: 

The proliferation of mobile technology, wireless sensors and social media provides a means of monitoring socio-economic activity, consumption of resources and human mobility. Recent advances in data science make it possible to cope with the technical challenges of collecting and managing big data and of developing actionable insights from it. Partnerships between academia, government and the private sector are at the heart of a revolution that is demonstrating how data is a valuable commodity and a source of intellectual property. This course takes a practical approach to solving challenges in the public and private sectors using the collection of techniques that constitute the new multidisciplinary field known as data science. A number of themes will be explored as case studies to demonstrate how big data collected from a wide range of disparate sources can be combined to provide insights, drive decisions and influence policy. The course content is structured to provide a roadmap for deploying data science techniques using case studies, reading material and previously published models. Participants will obtain hands-on experience by working on real-world datasets during assignments.

Learning objectives:

The objective of this course is to provide students with practical experience of the different techniques and skills that constitute the field of data science. In particular, the case studies are selected to demonstrate the technical challenges of dealing with the three V’s that define big data (volume, velocity and variety). The steps required will include: (1) exploration of data using visualization techniques; (2) construction of features; (3) evaluation of a collection of models; (4) consideration of how a decision-maker can utilize the analysis; and (5) development of a dashboard for displaying the results of the analysis. The sources of big data will range from surveys to mobile data to satellite imagery and will therefore involve both structured and unstructured data.
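
As an illustration of step (3), the evaluation of a collection of models, the following is a minimal Python sketch and not course material. Everything in it is an assumption made for illustration: the synthetic temperature and demand data, the function names, the two candidate models (a mean baseline and a one-feature least-squares fit) and the 70/30 hold-out split.

```python
import random
import statistics


def make_synthetic_data(n=200, seed=0):
    """Hypothetical stand-in for a real dataset: (temperature, demand) pairs."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        temp = rng.uniform(10.0, 35.0)
        # Assumed relationship, chosen only so the example has some structure.
        demand = 50.0 + 2.0 * temp + rng.gauss(0.0, 3.0)
        rows.append((temp, demand))
    return rows


def fit_mean(train):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y


def fit_linear(train):
    """Ordinary least squares with a single feature."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mx) * (y - my) for x, y in train) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mse(model, rows):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in rows)


def evaluate_models(rows, train_fraction=0.7):
    """Fit each candidate on the training split, score it on the held-out split."""
    split = int(len(rows) * train_fraction)
    train, test = rows[:split], rows[split:]
    models = {"mean_baseline": fit_mean(train), "linear": fit_linear(train)}
    return {name: mse(model, test) for name, model in models.items()}


scores = evaluate_models(make_synthetic_data())
best = min(scores, key=scores.get)  # lowest held-out error
print(scores, best)
```

Scoring every candidate on data it was not fitted to is what allows the models to be compared fairly before an approach is recommended to a decision-maker.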


After completing this course, students should be able to: 
  • Identify sources of big data in response to a specific challenge
  • Download and organize data for addressing the challenge
  • Explore the dataset using visualization techniques
  • Develop a number of features to extract information
  • Construct a range of quantitative models
  • Discuss the advantages and disadvantages of different models
  • Select an approach that is optimal for meeting the objective
  • Present conclusions and recommendations
  • Communicate model output to decision-makers

Content details:

1. Weather and climate impacts

2. Survey data

3. Google Trends

4. Sentiment analysis

5. Mobile data

6. Big data for development

Faculty: Patrick McSharry