Abstract
The COVID-19 pandemic has posed an unprecedented challenge to public health and has prompted global efforts to understand, record, and alleviate the disease. This research aims to generate relevant summaries of Coronavirus-related literature. It uses the COVID-19 Open Research Dataset (CORD-19) provided by the Allen Institute for AI, which contained 236,336 full-text academic articles as of July 19, 2021. This paper introduces a web-based system that answers user questions over these full-text scholarly articles. The system periodically runs backend services that process this large volume of articles with basic Natural Language Processing (NLP) techniques, including tokenization, N-gram extraction, and part-of-speech (PoS) tagging. It automatically identifies the keywords in a question, uses cosine similarity to summarize the associated content, and presents the summary to the user. This research may benefit researchers, health workers, and other individuals. Moreover, the same service can be trained on datasets from other domains (e.g., education) to generate relevant summaries for other user groups (e.g., students).
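The pipeline described in the abstract (tokenization, N-gram extraction, keyword identification, and cosine-similarity ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `STOP_WORDS` list, and the `top_k` parameter are hypothetical, a small stop-word filter stands in for PoS-based keyword selection, and raw term counts are assumed as the vector weights.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a stand-in for PoS-based keyword filtering.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "in", "to", "and",
              "what", "how", "does", "do", "for", "on", "by", "with"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping hyphenated terms."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text, max_n=2):
    """Count unigrams up to max_n-grams after removing stop words."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize(question, sentences, top_k=2):
    """Return up to top_k sentences most similar to the question, best first."""
    q = features(question)
    scored = [(cosine_similarity(q, features(s)), s) for s in sentences]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Sentences that share no terms with the question are dropped.
    return [s for score, s in ranked[:top_k] if score > 0]
```

Calling `summarize(question, sentences)` with a user question and a list of candidate sentences returns the sentences whose unigram and bigram counts overlap most with the question. A production system would more likely use a PoS tagger and weighted vectors (for example TF-IDF), which this sketch omits.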
| Original language | English |
|---|---|
| Title of host publication | Intelligent Systems Design and Applications - 21st International Conference on Intelligent Systems Design and Applications, ISDA 2021 |
| Editors | Ajith Abraham, Niketa Gandhi, Thomas Hanne, Tzung-Pei Hong, Tatiane Nogueira Rios, Weiping Ding |
| Pages | 508-517 |
| Number of pages | 10 |
| Publication status | Published - 2022 |
| Event | 21st International Conference on Intelligent Systems Design and Applications, ISDA 2021 - Virtual, Online; Duration: 13 Dec. 2021 → 15 Dec. 2021 |
Publication series
| Name | Lecture Notes in Networks and Systems |
|---|---|
| Volume | 418 LNNS |
| ISSN (Print) | 2367-3370 |
| ISSN (Electronic) | 2367-3389 |
Conference
| Conference | 21st International Conference on Intelligent Systems Design and Applications, ISDA 2021 |
|---|---|
| City | Virtual, Online |
| Period | 13/12/21 → 15/12/21 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Coronavirus
- Information extraction
- N-grams
- Parts of speech
- Question and answering
Fingerprint
Dive into the research topics of 'Summary Generation Using Natural Language Processing Techniques and Cosine Similarity'. Together they form a unique fingerprint.