Predicting student disengagement: Harnessing visual cues for intelligent tutoring systems



  • Mehmet Firat



Student Engagement, Visual Cues, Intelligent Tutoring Systems


Intelligent tutoring systems can enhance the learning experience for children, but effective learning depends on detecting and addressing early signs of disengagement. In this paper, we propose a method that uses visual features from a tablet tutor's user-facing camera to predict whether a student will complete the current activity or disengage from it. Unlike previous approaches that relied on tutor-specific features, our method relies only on visual cues, which makes it applicable to a variety of tutoring systems. For prediction, we use a deep learning approach based on a Long Short-Term Memory (LSTM) model trained with a target replication loss function. The model is trained and tested on screen capture videos of children in Tanzania using a tablet tutor to learn basic Swahili literacy and numeracy. With 40% of the activity remaining, the model achieves a prediction accuracy of 73.3% on balanced classes. We also analyze how prediction accuracy varies across tutor activities, which reveals two distinct causes of disengagement. The findings indicate that the model not only predicts disengagement but also identifies visual indicators of negative affective states that do not necessarily lead to non-completion of the task. This work contributes to the automated detection of early signs of disengagement, which can help improve tutoring systems and guide pedagogical decisions in real time.
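The abstract names a target replication loss but does not give its formula. As a minimal sketch, assuming the commonly used formulation in which the sequence-level label is replicated at every time step and the averaged per-step loss is blended with the final-step loss, the computation might look like the following. The function names, the weighting parameter `alpha`, and the use of binary cross-entropy are illustrative assumptions, not details taken from the paper.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between a predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def target_replication_loss(step_probs, label, alpha=0.5):
    """Blend the mean per-step loss with the final-step loss.

    step_probs: predicted probability of "will complete" at each time step
                (hypothetical per-step outputs of a sequence model).
    label:      1 if the activity was completed, 0 otherwise; the same label
                is replicated as the target at every step.
    alpha:      assumed weight on the replicated (per-step) term.
    """
    if not step_probs:
        raise ValueError("step_probs must contain at least one prediction")
    per_step = [binary_cross_entropy(p, label) for p in step_probs]
    replicated = sum(per_step) / len(per_step)
    final = per_step[-1]
    return alpha * replicated + (1.0 - alpha) * final


# A sequence that is confident early is penalized less than one that
# only becomes confident at the last step.
early = target_replication_loss([0.9, 0.9, 0.9], label=1)
late = target_replication_loss([0.1, 0.1, 0.9], label=1)
print(early < late)
```

Under this assumed formulation, the replicated term rewards correct predictions made before the activity ends, which fits the stated goal of predicting completion while part of the activity still remains.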






How to Cite

Mehmet Firat. (2023). Predicting student disengagement: Harnessing visual cues for intelligent tutoring systems. London Journal of Social Sciences, (6), 75–83.