TY - GEN
T1 - Identifying student difficulty in a digital learning environment
AU - Harris, Steven C.
AU - Kumar, Vivekanandan
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/8/10
Y1 - 2018/8/10
AB - This paper discusses the development of TutorAlert, a natural language processing system similar to those used in sentiment analysis, but applied to the data generated by students in a digital online learning environment in order to detect confused or frustrated students. A number of machine learning algorithms were tested in the development process, including Support Vector Machines (SVM), Naive Bayes, and Random Forest classifiers. As well, an array of natural language preparation techniques were employed to determine the optimum preprocessing configuration to produce relevant results. We found that detecting potential student frustration or confusion was most successful using a Sequential Minimal Optimization algorithm (SMO), along with the Stanford Part-Of-Speech Tagger (POS Tagger), the iterated version of the Lovins stemmer, and a custom dictionary to help determine relevance probability. This model produced a promising initial F1 score of 0.79 and an accuracy of 0.83. Further, agreement values of 88% were achieved during inter-rater reliability testing between the classifier and human judges.
KW - E-Learning
KW - Machine Learning
KW - NLP
KW - Natural Language Processing
KW - Opinion Mining
KW - Sentiment Analysis
UR - http://www.scopus.com/inward/record.url?scp=85052508734&partnerID=8YFLogxK
U2 - 10.1109/ICALT.2018.00054
DO - 10.1109/ICALT.2018.00054
M3 - Published Conference contribution
AN - SCOPUS:85052508734
SN - 9781538660492
T3 - Proceedings - IEEE 18th International Conference on Advanced Learning Technologies, ICALT 2018
SP - 199
EP - 201
BT - Proceedings - IEEE 18th International Conference on Advanced Learning Technologies, ICALT 2018
A2 - Chen, Nian-Shing
A2 - Chang, Maiga
A2 - Huang, Ronghuai
A2 - Kinshuk, K.
A2 - Moudgalya, Kannan
A2 - Murthy, Sahana
A2 - Sampson, Demetrios G.
T2 - 18th IEEE International Conference on Advanced Learning Technologies, ICALT 2018
Y2 - 9 July 2018 through 13 July 2018
ER -