TY - GEN
T1 - MuLAD
T2 - 27th International Conference on Pattern Recognition, ICPR 2024
AU - Hasan, Md Maruf
AU - Ahsan, Shawly
AU - Hoque, Mohammed Moshiul
AU - Dewan, M. Ali Akber
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Aggression detection from memes is challenging due to their region-specific interpretation and multimodal nature. Detecting or classifying aggressive memes is especially difficult in low-resource languages such as Bengali because benchmark datasets and basic language processing tools are lacking. This paper proposes a meme classification technique that uses deep learning (DL) approaches to exploit the visual and textual features of Bengali memes. Several DL models, including VGG16, VGG19, ResNet50, CNN, BiLSTM, and BiLSTM+CNN, are used to extract visual and textual features from memes. A novel corpus named the Bengali Meme Dataset (AMemD) is also introduced, comprising a substantial amount of multimodal data with text and image components. Experimental results on AMemD demonstrate the effectiveness of the proposed approach: the CNN combined with VGG16 obtained the highest F1-score of 0.738 among all multimodal techniques tested. This research offers insights into aggression detection from Bengali memes and provides a foundation for future studies in this area. The dataset is available at https://github.com/Maruf089/Multimodal-Aggression-Detection.
KW - Aggressive memes
KW - Deep learning
KW - Meme classification
KW - Multimodal fusion
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85212290594&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-78119-3_8
DO - 10.1007/978-3-031-78119-3_8
M3 - Published Conference contribution
AN - SCOPUS:85212290594
SN - 9783031781186
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 107
EP - 123
BT - Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
A2 - Antonacopoulos, Apostolos
A2 - Chaudhuri, Subhasis
A2 - Chellappa, Rama
A2 - Liu, Cheng-Lin
A2 - Bhattacharya, Saumik
A2 - Pal, Umapada
Y2 - 1 December 2024 through 5 December 2024
ER -