Raja Sooriamurthi and Xiaoying Tu

Raja Sooriamurthi
Teaching Professor
Information Systems
Heinz College of Information Systems and Public Policy
Fall 2024

Xiaoying Tu
Assistant Teaching Professor
Information Systems
Heinz College of Information Systems and Public Policy
Fall 2024
Course: 67-262 Database Design and Development (14-week course)
Research Question(s):
- To what extent does the source of feedback (instructor vs generative AI) affect Structured Query Language (SQL) assignment and exam performance?
- Does the source of feedback affect student perceptions about the usefulness of and comfort during the feedback session?
- Does the source of feedback impact the development of students’ self-efficacy?
Students uploaded their coding assignment deliverables to a customized genAI chatbot called the Intelligent Assessor (designed by Sooriamurthi and Tu). The instructors fine-tuned the chatbot using the assignment rubric, their paper detailing the three-step heuristic process of formulating any SQL inquiry, SQL style guidelines, and documentation of mistakes made by previous students. The customized chatbot asked each student questions about their individual assignment responses, prompting them to describe their thinking and decision-making process. For each student, the chatbot generated unique follow-up questions that encouraged the student to essentially “think out loud.”
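The report does not describe how the Intelligent Assessor was implemented. As a point of reference only, a rubric-grounded feedback chatbot of this general kind can be sketched as below, assuming an OpenAI-style chat API; the model name, file names, and the build_system_prompt helper are hypothetical illustrations, not the instructors’ actual code.

```python
# Minimal sketch of a rubric-grounded feedback chatbot (hypothetical; not the
# instructors' actual Intelligent Assessor code). Assumes an OpenAI-style chat
# API and locally stored course materials.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_system_prompt(materials_dir: str) -> str:
    """Compose grounding context from the rubric, the SQL heuristic paper,
    the style guide, and a catalog of common mistakes (file names illustrative)."""
    sources = ["rubric.md", "sql_heuristic.md", "style_guide.md", "common_mistakes.md"]
    context = "\n\n".join(Path(materials_dir, name).read_text() for name in sources)
    return (
        "You are an encouraging SQL teaching assistant. Ask the student probing "
        "follow-up questions about their submitted work so they explain their "
        "thinking out loud. Guide them toward understanding; do not give answers "
        "directly.\n\n" + context
    )

def debrief(student_submission: str, materials_dir: str = "course_materials") -> None:
    """Run a short interactive debrief over one student's assignment deliverable."""
    messages = [
        {"role": "system", "content": build_system_prompt(materials_dir)},
        {"role": "user", "content": "Here is my assignment submission:\n" + student_submission},
    ]
    for _ in range(5):  # a few follow-up turns, for illustration
        reply = client.chat.completions.create(model="gpt-4o", messages=messages)
        question = reply.choices[0].message.content
        print("Assessor:", question)
        messages.append({"role": "assistant", "content": question})
        messages.append({"role": "user", "content": input("Student: ")})
```

Grounding the system prompt in the rubric and documented past mistakes, rather than relying on a base model alone, is what would keep follow-up questions specific to each student’s submission rather than generic.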
Study Design: After completing a SQL assignment, Sooriamurthi and Tu randomly assigned students to debrief and receive feedback on their assignment from either an instructor or the customized genAI chatbot. Students then completed another SQL assignment and debriefed in the counterbalanced condition. Following each debriefing session, students responded to questions about the experience and the value of the feedback received.
Sample size: Total sample of 33 students, randomly assigned to counterbalanced condition orders
Data Sources:
- Rubric scores from students’ three SQL assignment deliverables
- Pre/post surveys of students’ self-efficacy for working with SQL, administered at the beginning of the course and after each debrief session
- Post surveys of students’ perceptions of the value of feedback received and comfort with the feedback interaction, administered after each debrief session
Results:
- The source of feedback (instructor vs genAI) did not affect performance on either the SQL assignments or the exam.
Figure 1. Students’ performance on three SQL assignments showed a significant main effect of Assignment (F(2, 62) = 8.53, p < .001, ηp² = .22), in which performance on Assignment 1 was significantly lower than on Assignments 2 and 3 (p < .001 and .03, respectively). Time did not significantly interact with Order of Conditions (F(2, 62) = 0.22, p = .81). Error bars are 95% confidence intervals for the means.
- Students consistently reported feeling less nervous receiving feedback from the genAI than from the instructor, regardless of whether the genAI session came first or second of the two feedback sessions. Consistent with this, students reported being more comfortable during the genAI feedback session than during the instructor feedback session. There was no difference between feedback conditions in how useful students perceived the feedback to be, how much they believed it deepened their understanding, their enjoyment, their intention to revise their process, or their interest in receiving this type of feedback again.
- Students significantly grew in their self-efficacy from the beginning of the semester to the end of the first feedback session. They maintained their increased self-efficacy for working with SQL to the end of the second feedback session but did not show significant further growth. The source of their feedback (instructor vs genAI) did not affect the trajectory of their self-efficacy growth.
Figure 2. Students’ self-efficacy (0-100% confidence) for working with SQL showed a significant main effect of Time (F(1.176, 29.407) = 29.341, p < .001, ηp² = .54), in which self-efficacy grew significantly from the pre-course survey to post-Feedback 1 and to post-Feedback 2 (ps < .001), but the change from post-Feedback 1 to post-Feedback 2 was not significant (p = .29). Time did not significantly interact with Order of Conditions (F(1.176, 29.407) = 0.78, p = .40). Error bars are 95% confidence intervals for the means.
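The report does not state which software produced the statistics above. For readers who want to run a comparable analysis, the sketch below shows one way to fit a mixed (within-between) ANOVA in Python with the pingouin library; the data file and column names are hypothetical, and this is not the authors’ analysis code.

```python
# Illustrative mixed-ANOVA sketch (not the authors' analysis code).
# Assumes long-format data, one row per student per time point, with
# hypothetical columns: 'student', 'order' (condition order), 'time', 'score'.
import pandas as pd
import pingouin as pg

df = pd.read_csv("self_efficacy_long.csv")  # hypothetical file

# Within-subjects factor: time (pre, post-Feedback 1, post-Feedback 2);
# between-subjects factor: order of conditions (instructor-first vs genAI-first).
aov = pg.mixed_anova(data=df, dv="score", within="time",
                     subject="student", between="order")
print(aov)  # F, p values, and partial eta-squared (np2) per effect

# Follow-up pairwise comparisons across time points, Bonferroni-adjusted.
posthoc = pg.pairwise_tests(data=df, dv="score", within="time",
                            subject="student", padjust="bonf")
print(posthoc)
```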
Eberly Center’s Takeaways:
- RQ1: Debriefing a SQL assignment with a customized genAI chatbot (the Intelligent Assessor) did not affect SQL performance compared to receiving feedback from a course instructor. Performance was high to begin with, however, leaving little room for improvement. It would be useful to test a control condition in which students receive no feedback, to determine whether both groups in the present study experienced similar positive effects or whether performance simply did not change.
- RQ2 and RQ3: Although performance and self-efficacy did not differ by feedback source, using this kind of customized genAI may be a viable option for giving personalized feedback in large classes. There may be an added benefit of using genAI in this way: students reported less nervousness when receiving feedback from the genAI than when interacting with a course instructor. Importantly, however, this genAI chatbot was carefully fine-tuned to help ensure accuracy for SQL material, to maintain an encouraging persona with students, and to guide students toward understanding rather than giving answers directly.