TY - GEN
T1 - An Ensemble Framework for Dropout Prediction in Online Learning
AU - Srinivasan, Sruthi
AU - Dewan, M. Ali Akber
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Online learning has gained traction in recent years, especially as online education has become more widespread. However, it comes with its own set of challenges, of which high dropout remains a major one. Identifying at-risk learners at an early stage is pivotal to offering personalized attention that can potentially prevent them from dropping out of online courses. This work proposes two methods to analyze students' progress in an online course and subsequently identify dropout-prone students. The first method fuses course activity features by concatenating previous weeks' features before training. The second method extracts course activity features from the start date of the course up to the current week, instead of concatenating them as the first method does. A set of machine learning models and an ensemble framework were trained and tested on these two types of feature sets. The models were evaluated on the benchmark dataset KDDCup15, where the first method yielded an F1-score of 91% while the second method yielded a score of 92%. Both feature fusion methods produced comparable results, although we expected that concatenating the features over time would produce better results. We also found that using features over a longer duration can help achieve better performance. Further, the ensemble model consistently outperformed the base classifiers.
AB - Online learning has gained traction in recent years, especially as online education has become more widespread. However, it comes with its own set of challenges, of which high dropout remains a major one. Identifying at-risk learners at an early stage is pivotal to offering personalized attention that can potentially prevent them from dropping out of online courses. This work proposes two methods to analyze students' progress in an online course and subsequently identify dropout-prone students. The first method fuses course activity features by concatenating previous weeks' features before training. The second method extracts course activity features from the start date of the course up to the current week, instead of concatenating them as the first method does. A set of machine learning models and an ensemble framework were trained and tested on these two types of feature sets. The models were evaluated on the benchmark dataset KDDCup15, where the first method yielded an F1-score of 91% while the second method yielded a score of 92%. Both feature fusion methods produced comparable results, although we expected that concatenating the features over time would produce better results. We also found that using features over a longer duration can help achieve better performance. Further, the ensemble model consistently outperformed the base classifiers.
KW - Machine learning models
KW - dropout prediction
KW - ensemble classifier
KW - feature fusion
UR - http://www.scopus.com/inward/record.url?scp=85150684926&partnerID=8YFLogxK
U2 - 10.1109/ICKECS56523.2022.10059775
DO - 10.1109/ICKECS56523.2022.10059775
M3 - Published Conference contribution
AN - SCOPUS:85150684926
T3 - IEEE International Conference on Knowledge Engineering and Communication Systems, ICKES 2022
BT - IEEE International Conference on Knowledge Engineering and Communication Systems, ICKES 2022
T2 - 2022 IEEE International Conference on Knowledge Engineering and Communication Systems, ICKES 2022
Y2 - 28 December 2022 through 29 December 2022
ER -