Carnegie Mellon University
Eberly Center

Teaching Excellence & Educational Innovation

Christopher McComb

Associate Professor
College of Engineering
Spring 2025

24-262 Mechanics II: 3D Design (14-week course)

Research Question(s): 

  1. To what extent does the use of generative AI as a tutee increase learners’ mastery of key concepts in a core undergraduate mechanical engineering course?
  2. To what extent does student self-efficacy change over the course of a semester in a core undergraduate mechanical engineering course in which generative AI is used as a tutee?

Teaching Intervention with Generative AI (genAI):

McComb prompted genAI (Microsoft Copilot) to function as a tutee in a 14-week Mechanical Engineering course. Students were onboarded to best practices in peer tutoring and genAI prompting at the beginning of the semester. Over the course of the semester, students interacted with the genAI tool on eight homework assignments, using pre-engineered prompts that instructed the tool to make specific mistakes and to act like a novice student (i.e., a tutee). Students then had to tutor the tool by identifying and correcting its conceptual mistakes. To keep students from knowing in advance which mistakes the tool had been prompted to make, all initial prompts were given to them in Indonesian to enter into the tool.

Study Design:

McComb taught the course during the Spring 2024 (control) and Spring 2025 (treatment) semesters. During Spring 2024, students received similar practice with course concepts by completing homework assignments without the genAI-as-tutee intervention. In Spring 2025, as part of the homework assignments, students were directed to take on the role of a tutor and interact with the genAI tool, identifying and correcting conceptual mistakes that the tool made. McComb compared student performance on two project design reports and a concept inventory between the semesters, and also measured students’ shift in self-efficacy in the treatment semester only.

Sample size: Treatment (130 students); Control (136 students) 

Data Sources:

  1. Students’ two design reports assessed with a rubric by the instructor.
  2. Students’ scores on The Mechanics of Materials Concept Inventory (MoMCI) taken at the end of the semester. 
  3. Pre and post surveys of students’ self-efficacy regarding engineering skills and genAI use (treatment semester only).


Findings:

  1. RQ1: Controlling for GPA, students in the S25 (treatment) semester significantly outperformed students in S24 (control) on one of two design reports (project 2).  


    Figure 1. Controlling for student GPA, students in the S25 semester (M = 98.69, SE = 0.29) outperformed students in the S24 semester (M = 97.02, SE = 0.29) on project 2, F(2,261) = 16.69, p < .001, η² = .06. There was no significant difference between the S24 semester (M = 98.13, SE = 0.21) and the S25 semester (M = 97.55, SE = 0.22) on project 1, F(2,261) = 3.68, p = .06. Error bars are 95% confidence intervals for the means.

    Controlling for GPA, students in the S25 (treatment) semester significantly outperformed students in the S24 (control) semester on the concept inventory taken at the end of their respective semesters. This finding was consistent across three of the four sub-skills (predicting deformation, predicting failure, influence of material properties), with the fourth skill (predicting location of failure) showing directional consistency but no significance.

    Figure 2. Controlling for student GPA, students in the S25 semester (M = 73.16, SE = 1.66) outperformed students in the S24 semester (M = 65.55, SE = 1.63) on the overall score for the engineering concept inventory taken at the end of the semester, F(2,262) = 13.69, p < .001, η² = .05. Three of the four component skills showed a significant difference between the semesters: predicting deformation, F(2,262) = 15.49, p < .001, η² = .06; predicting failure, F(2,262) = 5.74, p < .05, η² = .02; and influence of material properties, F(2,262) = 31.44, p < .001, η² = .11. Only one component skill, predicting location of failure, did not show a significant difference, F(2,262) = 0.86, p = .36. Error bars are 95% confidence intervals for the means.

  2. RQ2: Students’ self-efficacy for both course-related skills and genAI use significantly increased from the beginning to the end of the course in the S25 (treatment) semester. 

    Figure 3. In the S25 (treatment) semester, students’ self-efficacy for course skills significantly improved from the beginning (M = 70.37, SD = 14.24) to the end (M = 85.03, SD = 9.49) of the semester, t(114) = 11.59, p < .001, g = 1.07. These students’ self-efficacy for using genAI also increased from the beginning (M = 50.56, SD = 23.19) to the end (M = 73.18, SD = 22.52) of the semester, t(109) = 8.52, p < .001, g = 0.81. Error bars are 95% confidence intervals for the means.
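
The effect sizes in the figure captions above can be checked against the reported test statistics. The following is a minimal sketch, not the study's own analysis code. It assumes (the source does not say) that the eta-squared values are partial eta squared for the semester effect, and that g is the standardized mean change (t divided by the square root of the number of pairs) with the usual small-sample correction. The captions list 2 as the first degree of freedom for each F test; the value of 1 used here is an assumption, chosen because it reproduces the reported eta-squared values to two decimals. The function names are illustrative.

```python
import math


def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


def hedges_g_paired(t_stat, n_pairs):
    """Standardized mean change from a paired t statistic, with small-sample correction.

    Assumption: the effect size is d_z = t / sqrt(n), scaled by the
    approximate Hedges correction factor 1 - 3 / (4 * df - 1).
    """
    d_z = t_stat / math.sqrt(n_pairs)
    df = n_pairs - 1
    correction = 1 - 3 / (4 * df - 1)
    return d_z * correction


# Project 2 (Figure 1): F = 16.69, error df = 261, assumed effect df = 1
print(round(partial_eta_squared(16.69, 1, 261), 2))  # 0.06

# Overall concept inventory score (Figure 2): F = 13.69, error df = 262
print(round(partial_eta_squared(13.69, 1, 262), 2))  # 0.05

# Course-skill self-efficacy (Figure 3): t(114) = 11.59, so 115 paired responses
print(round(hedges_g_paired(11.59, 115), 2))  # 1.07

# GenAI self-efficacy (Figure 3): t(109) = 8.52, so 110 paired responses
print(round(hedges_g_paired(8.52, 110), 2))  # 0.81
```

Agreement with the reported numbers is consistent with these formulas but does not confirm that they are the ones the analysts used.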

Eberly Center’s Takeaways:

  1. RQ1: Tutoring the genAI tool consistently across the semester significantly aided students’ comprehension of course material, as indicated by improved performance on both a concept inventory and one of two engineering project reports compared to a control group. While tutoring the genAI bot for four weeks did not improve performance on Report 1, tutoring across eight weeks significantly improved performance on Report 2. This suggests a possible dosage effect, or additive gains from learning-by-tutoring over time. Indeed, differences on the concept inventory, which was taken at the end of the semester, were even more pronounced than differences on Report 2.


    Importantly, the concept inventory is a widely used, validated instrument for measuring students’ understanding of mechanics of materials. These findings further highlight the benefits of sustained tutoring interactions with a genAI tool. Moreover, they are consistent with literature suggesting that explaining material benefits learning and retention, and they point to a novel approach to using genAI in the classroom. While genAI’s widespread availability as a tutee is a benefit to instructors wishing to implement a sustained tutoring approach, careful and time-intensive engineering of an initial prompt is essential to ensure beneficial tutoring interactions.

  2. RQ2: Although self-efficacy increased for all skills in the Spring 2025 (treatment) section, these data were not collected during the Spring 2024 (control) section. Therefore, we cannot say to what extent these pre/post increases are attributable to the intervention.