Real-time Multi-module Student Engagement Detection System

Pooja Ravi, M. Ali Akber Dewan

Research output: Chapter in Book/Report/Conference proceeding, Published Conference contribution, peer-review


We present a method that aggregates four facial cues to help identify distraction among online learners: facial emotion detection, micro-sleep tracking, yawn detection, and iris distraction detection. In the proposed method, the first module identifies facial emotions using both 2D and 3D convolutional neural networks (CNNs), which allows a comparison between purely spatial and spatiotemporal features. The other three modules use a 3D facial mesh to localize the eye and lip coordinates, track the student’s facial landmarks, and identify iris positions as well as signs of micro-sleep such as yawning or drowsiness. The results from the modules are combined into a single overall label displayed on an integrated user interface, which can also be used to send real-time alerts to students and instructors when required. In our experiments, the emotion, micro-sleep, yawn, and iris monitoring modules individually achieved accuracies of 72.5%, 95%, 97%, and 93%, respectively.
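As a rough illustration of how landmark-based cues and a combined label might fit together, the following minimal Python sketch computes an opening ratio from four mesh landmarks and merges four module outputs into one label. It is not the authors' implementation: the abstract does not specify the landmark measure, the thresholds, the emotion classes, or the combination rule, so `opening_ratio`, `ModuleOutputs`, `aggregate`, both threshold constants, the label strings, and the priority order are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import dist

# Hypothetical thresholds; the abstract does not report any values.
EYE_CLOSED_BELOW = 0.20
YAWN_ABOVE = 0.60


def opening_ratio(top, bottom, left, right):
    """Vertical gap divided by horizontal width of an eye or mouth.

    Each argument is an (x, y, z) landmark taken from a 3D face mesh.
    """
    width = dist(left, right)
    return dist(top, bottom) / width if width > 0 else 0.0


@dataclass
class ModuleOutputs:
    emotion: str           # label from the emotion module (assumed string)
    eyes_closed: bool      # micro-sleep module
    yawning: bool          # yawn module
    iris_off_screen: bool  # iris distraction module


# Assumed set of emotion labels treated as signs of disengagement.
NEGATIVE_EMOTIONS = {"bored", "frustrated"}


def aggregate(outputs):
    """Combine the four module outputs into one label (assumed priority order)."""
    if outputs.eyes_closed or outputs.yawning:
        return "drowsy"
    if outputs.iris_off_screen:
        return "distracted"
    if outputs.emotion in NEGATIVE_EMOTIONS:
        return "disengaged"
    return "engaged"
```

For example, an eye whose lids are 2 units apart across a width of 4 units gives `opening_ratio` of 0.5, which would count as open under the assumed `EYE_CLOSED_BELOW` threshold. A real system would also need to require that the eyes stay closed over several consecutive frames before flagging micro-sleep.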

Original language: English
Title of host publication: Communication and Intelligent Systems - Proceedings of ICCIS 2022
Editors: Harish Sharma, Vivek Shrivastava, Kusum Kumari Bharti, Lipo Wang
Number of pages: 18
Publication status: Published - 2023
Event: 4th International Conference on Communication and Intelligent Systems, ICCIS 2022 - New Delhi, India
Duration: 19 Dec. 2022 - 20 Dec. 2022

Publication series

Name: Lecture Notes in Networks and Systems
Volume: 686 LNNS
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389


Conference: 4th International Conference on Communication and Intelligent Systems, ICCIS 2022
City: New Delhi


  • 2D and 3D CNNs
  • Facial landmark detection
  • Online learning
  • Spatiotemporal features
  • Student engagement detection


