Publications
Mining patterns in syntax trees to automate code reviews of student solutions for programming exercises
Abstract
In programming education, providing manual feedback is essential but labour-intensive, posing challenges in consistency and timeliness. We introduce ECHO, a machine learning method to automate the reuse of feedback in educational code reviews by analysing patterns in abstract syntax trees. This study investigates two primary questions: whether ECHO can predict feedback annotations to specific lines of student code based on previously added annotations by human reviewers (RQ1), and whether its training and prediction speeds are suitable for using ECHO for real-time feedback during live code reviews by human reviewers (RQ2). Our results, based on annotations from both automated linting tools and human reviewers, show that ECHO can accurately and quickly predict appropriate feedback annotations. Its efficiency in processing and its flexibility in adapting to feedback patterns can significantly reduce the time and effort required for manual feedback provisioning in educational settings.
Citation
Van Petegem, C., Demeyere, K., Maertens, R., Strijbol, N., De Wever, B., Mesuere, B., & Dawyndt, P. (2024). arXiv preprint. https://doi.org/10.48550/arXiv.2402.10853
Blink: An educational software debugger for Scratch
Abstract
The process of teaching children to code is often slowed down by the delay in providing feedback on each student’s code. Especially in larger classrooms, teachers often lack the time to give individual feedback to each student. That is why it is important to equip children with tools that can provide immediate feedback and thus enhance their independent learning skills. This article presents Blink, a debugging tool specifically designed for Scratch, the most commonly taught programming language for children. Blink comes with basic debugging features such as ‘step’ and ‘pause’, allowing precise monitoring of the execution of Scratch programs. It also provides users with more advanced debugging options, such as back-in-time debugging and programmable pause. A group of children attending an extracurricular coding class have been testing the usefulness of Blink. Feedback from these young users indicates that Blink helps them pinpoint programming errors more accurately, and they have expressed an overall positive view of the tool.
Citation
Strijbol, N., De Proft, R., Goethals, K., Mesuere, B., Dawyndt, P., & Scholliers, C. (2024). Blink: An educational software debugger for Scratch. SoftwareX, 25. https://doi.org/10.1016/j.softx.2023.101617
Reproducing Predictive Learning Analytics in CS1: Toward Generalizable and Explainable Models for Enhancing Student Retention
Abstract
Predictive learning analytics has been widely explored in educational research to improve student retention and academic success in an introductory programming course in computer science (CS1). General-purpose and interpretable dropout predictions still pose a challenge. Our study aims to reproduce and extend the data analysis of a privacy-first student pass–fail prediction approach proposed by Van Petegem and colleagues (2022) in a different CS1 course. Using student submission and self-report data, we investigated the reproducibility of the original approach, the effect of adding self-reports to the model, and the interpretability of the model features. The results showed that the original approach for student dropout prediction could be successfully reproduced in a different course context and that adding self-report data to the prediction model improved accuracy for the first four weeks. We also identified relevant features associated with dropout in the CS1 course, such as timely submission of tasks and iterative problem solving. When analyzing student behaviour, submission data and self-report data were found to complement each other. The results highlight the importance of transparency and generalizability in learning analytics and the need for future research to identify other factors beyond self-reported aptitude measures and student behaviour that can enhance dropout prediction.
Citation
Zhidkikh, D., Heilala, V., Van Petegem, C., Dawyndt, P., Järvinen, M., Viitanen, S., De Wever, B., Mesuere, B., Lappalainen, V., Kettunen, L., & Hämäläinen, R. (2024). Reproducing Predictive Learning Analytics in CS1: Toward Generalizable and Explainable Models for Enhancing Student Retention. Journal of Learning Analytics, 1–21. https://doi.org/10.18608/jla.2024.7979
Dodona: Learn to code with a virtual co-teacher that supports active learning
Abstract
Dodona (dodona.ugent.be) is an intelligent tutoring system for computer programming. It provides real-time data and feedback to help students learn better and teachers teach better. Dodona is free to use and has more than 61 thousand registered users across many educational and research institutes, including 20 thousand new users in the last year. The source code of Dodona is available on GitHub under the permissive MIT open-source license. This paper presents Dodona and its design and look-and-feel. We highlight some of the features built into Dodona that make it possible to shorten feedback loops, and discuss an example of how these features can be used in practice. We also highlight some of the research opportunities that Dodona has opened up and present some future developments.
Citation
Van Petegem, C., Maertens, R., Strijbol, N., Van Renterghem, J., Van der Jeugt, F., De Wever, B., Dawyndt, P., & Mesuere, B. (2023). Dodona: Learn to code with a virtual co-teacher that supports active learning. SoftwareX, 24, 101578. https://doi.org/10.1016/j.softx.2023.101578
Discovering and exploring cases of educational source code plagiarism with Dolos
Abstract
Source code plagiarism is a significant issue in educational practice, and educators need user-friendly tools to cope with such academic dishonesty. This article introduces the latest version of Dolos, a state-of-the-art ecosystem of tools for detecting and preventing plagiarism in educational source code. In this new version, the primary focus has been on enhancing the user experience. Educators can now run the entire plagiarism detection pipeline from a new web app in their browser, eliminating the need for any installation or configuration. Completely redesigned analytics dashboards provide an instant assessment of whether a collection of source files contains suspected cases of plagiarism and how widespread plagiarism is within the collection. The dashboards support hierarchically structured navigation to facilitate zooming in and out of suspect cases. Clusters are an essential new component of the dashboard design, reflecting the observation that plagiarism can occur among larger groups of students. To meet various user needs, the Dolos software stack for source code plagiarism detection now includes a self-hostable web app, a JSON application programming interface (API), a command line interface (CLI), a JavaScript library and a preconfigured Docker container. Clear documentation and a free-to-use instance of the web app can be found at https://dolos.ugent.be. The source code is also available on GitHub.
Citation
Maertens, R., Van Neyghem, M., Geldhof, M., Van Petegem, C., Strijbol, N., Dawyndt, P., & Mesuere, B. (2024). Discovering and exploring cases of educational source code plagiarism with Dolos. SoftwareX, 26, 101755. https://doi.org/10.1016/j.softx.2024.101755
TESTed — An educational testing framework with language-agnostic test suites for programming exercises
Abstract
In educational contexts, automated assessment tools (AAT) are commonly used to provide formative feedback on programming exercises. However, designing exercises for AAT remains a laborious task or imposes limitations on the exercises. Most AAT use either output comparison, where the generated output is compared against an expected output, or unit testing, where the tool has access to the code of the submission under test. While output comparison has the advantage of being programming language independent, the testing capabilities are limited to the output. Conversely, unit testing can generate more granular feedback, but is tightly coupled with the programming language of the submission. In this paper, we introduce TESTed, which enables the best of both worlds: combining the granular feedback of unit testing with the programming language independence of output comparison. Educators can save time by designing exercises that can be used across programming languages. Finally, we report on using TESTed in educational practice.
Citation
Strijbol, N., Van Petegem, C., Maertens, R., Sels, B., Scholliers, C., Dawyndt, P., & Mesuere, B. (2023). TESTed—An educational testing framework with language-agnostic test suites for programming exercises. SoftwareX, 22, 101404. https://doi.org/10.1016/j.softx.2023.101404
Pass/fail prediction in programming courses
Abstract
We present a privacy-friendly early-detection framework to identify students at risk of failing in introductory programming courses at university. The framework was validated for two different courses with annual editions taken by higher education students (N = 2 080) and was found to be highly accurate and robust against variation in course structures, teaching and learning styles, programming exercises and classification algorithms. By using interpretable machine learning techniques, the framework also provides insight into what aspects of practising programming skills promote or inhibit learning or have no or minor effect on the learning process. Findings showed that the framework was capable of predicting students’ future success already early on in the semester.
Citation
Van Petegem, C., Deconinck, L., Mourisse, D., Maertens, R., Strijbol, N., Dhoedt, B., De Wever, B., Dawyndt, P., & Mesuere, B. (2022). Pass/Fail Prediction in Programming Courses. Journal of Educational Computing Research, 68–95. https://doi.org/10.1177/07356331221085595
Dolos: Language-agnostic plagiarism detection in source code
Abstract
Learning to code is increasingly embedded in secondary and higher education curricula, where solving programming exercises plays an important role in the learning process and in formative and summative assessment. Unfortunately, students admit that copying code from each other is a common practice and teachers indicate they rarely use plagiarism detection tools. We want to lower the barrier for teachers to detect plagiarism by introducing a new source code plagiarism detection tool (Dolos) that is powered by state-of-the-art similarity detection algorithms, offers interactive visualizations, and uses generic parser models to support a broad range of programming languages. Dolos is compared with state-of-the-art plagiarism detection tools in a benchmark based on a standardized dataset. We describe our experience with integrating Dolos in a programming course with a strong focus on online learning and the impact of transitioning to remote assessment during the COVID-19 pandemic. Dolos outperforms other plagiarism detection tools in detecting potential cases of plagiarism and is a valuable tool for preventing and detecting plagiarism in online learning environments. It is available under the permissive MIT open-source license at https://dolos.ugent.be.
Citation
Maertens, R., Van Petegem, C., Strijbol, N., Baeyens, T., Jacobs, A. C., Dawyndt, P., & Mesuere, B. (2022). Dolos: Language-agnostic plagiarism detection in source code. Journal of Computer Assisted Learning. https://doi.org/10.1111/jcal.12662