University Archives

Poster Presentation

College of Engineering & Science

Yousif, Jacob, and Caitlin Snyder. "Evaluating Student Learning Models Using Synthetic Data and Bayesian Knowledge Tracing."

Collecting real student learning data to test and evaluate intelligent tutoring systems presents numerous challenges, including privacy concerns, limited accessibility, and ethical restrictions on data sharing. These barriers make it difficult for researchers to experiment with learning models or validate educational technologies at scale. To address this problem, we explored synthetic data generation as a way to simulate realistic student learning behaviors without compromising privacy. We first designed a set of educational questions to serve as the foundation for the synthetic learning data, then used the ChatGPT API, with prompt engineering and various levels of testing, to generate that data. The API was given a set of student personalities to choose from at random and generated data for 50 students. We then applied a Bayesian Knowledge Tracing model to the data, which estimated quantities such as the students’ prior knowledge, learning rate, and mastery. From these estimates, we were able to determine some general statistics on how the students learn and to show how synthetic data could be useful for researchers who don’t have ready access to real data. Possible next steps include using larger datasets, comparing real data with synthetic data, and evaluating how Bayesian Knowledge Tracing compares with other student learning models.
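
For readers unfamiliar with Bayesian Knowledge Tracing, the following is a minimal sketch of the standard four-parameter update for a single skill. It is not the authors' implementation, which the abstract does not describe. The names `BKTParams` and `bkt_trace` and the parameter values in the usage lines are illustrative assumptions, and the sketch only traces mastery from given parameters rather than fitting them to data.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    # Illustrative container (hypothetical name) for the standard BKT parameters.
    p_init: float   # P(L0): chance the skill is known before any practice
    p_learn: float  # P(T): chance of learning the skill at each opportunity
    p_slip: float   # P(S): chance of a wrong answer despite knowing the skill
    p_guess: float  # P(G): chance of a right answer without knowing the skill


def bkt_trace(responses, params):
    """Return the estimated mastery probability after each response.

    `responses` is a sequence of booleans (True = correct answer).
    """
    p_known = params.p_init
    trace = []
    for correct in responses:
        # Bayes update: condition the mastery estimate on the observed answer.
        if correct:
            num = p_known * (1 - params.p_slip)
            den = num + (1 - p_known) * params.p_guess
        else:
            num = p_known * params.p_slip
            den = num + (1 - p_known) * (1 - params.p_guess)
        posterior = num / den
        # Learning transition: an unmastered skill may be learned at this step.
        p_known = posterior + (1 - posterior) * params.p_learn
        trace.append(p_known)
    return trace


# Assumed parameter values, chosen only for illustration.
example_params = BKTParams(p_init=0.2, p_learn=0.15, p_slip=0.1, p_guess=0.2)
example_trace = bkt_trace([True, False, True, True], example_params)
```

In practice the four parameters are estimated from response data, for example with expectation-maximization, which is how prior knowledge and learning rate would be recovered from a dataset such as the 50 synthetic students described above.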
