Carnegie Mellon University

Other Helpful Resources

Whether you’re just getting started or need sophisticated data analysis tools, the websites and software listed below can help you with data collection and analysis. Please note that not all of the tools listed here are free.

Know a great tool that we missed? Email us and we’ll add it to the list to share with our growing community of instructors, course designers, curriculum developers, learning engineers, educational technologists and learning researchers.


Resources to help you collect data

The Open Learning Initiative offers free online courses and materials that can be used by individuals or by college or university instructors who want to supplement their classroom instruction. The careful design of interactive activities that help students learn-by-doing is not only helpful for students, but also produces data that is particularly useful for analysis.

LEARN MORE

The Cognitive Tutor Authoring Tool, or CTAT, is a tool suite that enables you to add learning by doing (i.e., active learning) to online courses. CTAT supports the creation of flexible tutors for both simple and complex problem solving, capable of supporting multiple strategies that students may draw on when solving tutor problems. CTAT tutors track students as they work through problems and provide context-sensitive, just-in-time help.

CTAT supports the creation of two types of tutors: example-tracing tutors, which can be created without programming but require problem-specific authoring, and cognitive tutors, which require AI programming to build a cognitive model of student problem solving but support tutoring across a range of problems. The two types of tutors are roughly behaviorally equivalent.

CTAT-based tutors can be created in Java or Flash. It is also possible to use CTAT to add tutoring to an existing simulator or problem-solving environment (see CycleTalk for an example of this approach).

CTAT tutors are seamlessly integrated with DataShop. Tutors developed with CTAT write detailed logs of the student-tutor interactions, suited for teachers or researchers. This logging capability does not require any extra effort by tutor authors. DataShop supports detailed analysis of students' learning trajectories based on the log data. For those who prefer to do logging only, CTAT provides a separate library with logging functions.

CTAT tutors can be delivered via OLI (Open Learning Initiative) courses, but a number of other delivery options are available, including standard web delivery.

LEARN MORE

Founded by cognitive and computer scientists from Carnegie Mellon University in conjunction with veteran mathematics teachers, Carnegie Learning has reinvented the traditional way of teaching math.

The curriculum is based on more than 20 years of research into how students think, learn, and apply new knowledge in mathematics. Carnegie Learning continues to partner with Carnegie Mellon and actively participates in the scientific community, frequently sharing results in refereed journals and at conferences. Carnegie Learning collects more than 250 million student observations annually, and it continuously collects and analyzes data and feedback from schools to enhance its curricula and help teach more creatively and efficiently.

LEARN MORE


Resources to help you analyze data

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.

LEARN MORE

SPSS is a family of software packages for statistical analysis, data and text mining, predictive modeling and decision optimization. Statistics included in the base software:

  • Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics
  • Bivariate statistics: Means, t-test, ANOVA, Correlation (bivariate, partial, distances), Nonparametric tests
  • Prediction for numerical outcomes: Linear regression
  • Prediction for identifying groups: Factor analysis, cluster analysis (two-step, K-means, hierarchical), Discriminant

LEARN MORE

Excel is a spreadsheet program coupled with a data analysis plug-in that provides the following analyses for spreadsheets in a workbook:

  • ANOVA
  • Correlation
  • Covariance
  • Descriptive Statistics
  • Exponential Smoothing
  • F-Test Two-Sample for Variances
  • Fourier Analysis
  • Histogram
  • Moving Average
  • Random Number Generation
  • Rank and Percentile
  • Regression
  • Sampling
  • t-Test
  • z-Test

LEARN MORE

LightSide began as an open source text mining and machine learning tool. The core technology is freely available. The workbench, as well as the core technology for machine learning and feature extraction, was developed under grants from the National Science Foundation and the Office of Naval Research, through Carnegie Mellon University’s Language Technologies Institute. This core is available as GPLv3 open source on Bitbucket.

LEARN MORE

This tool computes Cronbach's alpha for a DataShop dataset. Based on the alpha values obtained when each item is eliminated, it ranks item reliability within the dataset. It takes a dataset ID as its required input, extracts the data from DataShop automatically, and uploads the results to DataShop at your request.
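To make the item-elimination idea concrete, here is a minimal Python sketch of Cronbach's alpha and of alpha recomputed with each item left out. It is an illustration only, not the tool's own code: the function names, the student-by-item score layout, and the ranking rule (items whose removal raises alpha the most are listed first) are assumptions made for this example.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per student, one column per item).

    Assumes at least two items, at least two students, and non-constant total scores.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def rank_items_by_alpha_if_deleted(scores):
    """Return (item_index, alpha_without_item) pairs, highest alpha first.

    Hypothetical ranking rule: an item whose removal raises alpha the most
    is treated as the least consistent with the rest of the scale.
    """
    k = len(scores[0])
    results = []
    for drop in range(k):
        reduced = [[v for i, v in enumerate(row) if i != drop] for row in scores]
        results.append((drop, cronbach_alpha(reduced)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Three students, three items; the third item disagrees with the other two.
    data = [[1, 1, 3], [2, 2, 1], [3, 3, 2]]
    print(cronbach_alpha(data))
    print(rank_items_by_alpha_if_deleted(data))
```

Each row holds one student's item scores, so a real dataset would first need to be reshaped into that student-by-item form.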

LEARN MORE

This tool is a simplified version of Michael Yudelson's Bayesian Knowledge Tracing (BKT) tool. It uses "solver 1.2" (skill-gradient descent) and the defaults for all other parameters defined in Yudelson's BKT tool. It takes a dataset ID and a skill model ID as inputs, extracts the data from DataShop automatically, and uploads the BKT results to DataShop at your request.

LEARN MORE

Bayesian Knowledge Tracing is one of the most popular student modeling approaches in the field of Educational Data Mining. It is in wide use in the Intelligent Tutoring Systems community.
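For readers new to the approach, the following Python sketch shows the standard BKT update for a single skill: a Bayesian posterior on whether the student knows the skill given one observed response, followed by a learning transition. It is a textbook-style illustration, not code from any of the tools listed here; the function and parameter names are chosen for this example.

```python
def predict_correct(p_know, p_guess, p_slip):
    """Probability of a correct response given the current knowledge estimate."""
    return p_know * (1 - p_slip) + (1 - p_know) * p_guess


def bkt_update(p_know, correct, p_transit, p_guess, p_slip):
    """Update P(known) after observing one response (True/1 = correct)."""
    if correct:
        evidence = p_know * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_know) * p_guess)
    else:
        evidence = p_know * p_slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - p_guess))
    # The student may also learn the skill at this practice opportunity.
    return posterior + (1 - posterior) * p_transit


if __name__ == "__main__":
    # Illustrative parameter values only; real values are fit to data.
    p_know = 0.5
    for outcome in [0, 1, 1, 1]:
        p_know = bkt_update(p_know, outcome, p_transit=0.1, p_guess=0.2, p_slip=0.1)
        print(round(p_know, 3))
```

Fitting a BKT model means choosing the initial knowledge, transition, guess, and slip probabilities so that predictions like these match the observed responses.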

This toolkit is an attempt to make fitting BKT models on big data feasible.

Executables are available for Windows, Linux, and Mac, along with tutorial slides (watch tutorial here). Sources can be found on GitHub.

LEARN MORE

Code for computing A'/AUC that does not fail when multiple data points have the same confidence. It can also compare A'/AUC across models without violating the independence assumption.
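One common way to handle tied confidences is to compute A' as the fraction of positive/negative pairs that the model orders correctly, counting a tie as half a correct ordering. The Python sketch below illustrates that pairwise definition. It is an assumption-laden example rather than the linked code, and it does not cover the cross-model comparison; the function name and the 0/1 label encoding are chosen for this example.

```python
def auc_with_ties(labels, scores):
    """Pairwise A'/AUC: P(score of a positive > score of a negative), ties count 0.5.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of model confidences, same length as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("A'/AUC needs at least one positive and one negative example")
    credit = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                credit += 1.0
            elif p == n:
                credit += 0.5  # tied confidences are treated as a coin flip
    return credit / (len(positives) * len(negatives))


if __name__ == "__main__":
    # One positive and one negative share the confidence 0.5.
    print(auc_with_ties([1, 1, 0, 0], [0.9, 0.5, 0.5, 0.1]))  # 0.875
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration but slow for large logs.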

LEARN MORE

A toolkit for Bayesian Knowledge Tracing. Implements the Brute Force/Grid Search approach used in Baker et al. (2010, 2011a, 2011b, 2011c), Pardos et al. (2011, 2012), Gowda et al. (2011), and many other papers. This algorithm is approximately as effective as Expectation Maximization and runs faster.
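The brute-force idea can be sketched in a few lines of Python: try every combination of the four BKT parameters on a fixed grid and keep the one that fits the response sequences best. This is a simplified illustration, not the toolkit's code. The grid step, the open (0, 1) bounds, and the use of the sum of squared residuals as the fit criterion are assumptions made for this example, as are all function names.

```python
from itertools import product


def sum_squared_residuals(sequences, p_init, p_transit, p_guess, p_slip):
    """Squared error between predicted P(correct) and observed 0/1 responses."""
    total = 0.0
    for responses in sequences:
        p_know = p_init
        for correct in responses:
            p_correct = p_know * (1 - p_slip) + (1 - p_know) * p_guess
            total += (correct - p_correct) ** 2
            # Bayesian posterior on knowing the skill, then the learning transition.
            if correct:
                posterior = p_know * (1 - p_slip) / p_correct
            else:
                posterior = p_know * p_slip / (1 - p_correct)
            p_know = posterior + (1 - posterior) * p_transit
    return total


def grid_search_bkt(sequences, step=0.05):
    """Return ((p_init, p_transit, p_guess, p_slip), error) for the best grid point."""
    grid = [round(i * step, 10) for i in range(1, int(round(1 / step)))]
    best_params, best_error = None, float("inf")
    for params in product(grid, repeat=4):
        error = sum_squared_residuals(sequences, *params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


if __name__ == "__main__":
    # Each inner list is one student's responses on one skill (1 = correct).
    data = [[0, 0, 1, 1, 1], [0, 1, 1, 1, 1], [1, 0, 1, 1, 1]]
    print(grid_search_bkt(data, step=0.1))
```

A coarser step makes the search faster at the cost of precision, since the number of combinations grows with the fourth power of the grid size.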

LEARN MORE

This tool distills features used by Ryan Baker's group for affect, behavior, motivational, and meta-cognitive detectors. Designed for DataShop data and for Cognitive Tutor data, it can be modified to run with other LearnLabs and learning environments.

LEARN MORE

A tool for distilling log data and conducting text replays to label constructs in log data. It can take DataShop data as input and runs as a JAR file.

LEARN MORE

Android app for recording learner affect and engagement while using educational software. Has also been used to study kindergartners participating in classroom activities, and to study teacher behaviors.

LEARN MORE