A novel contextual topic model for multi-document summarization

Guangbing Yang, Dunwei Wen, Kinshuk, Nian Shing Chen, Erkki Sutinen

    Research output: Contribution to journal, Journal Article, peer-reviewed

    63 Citations (Scopus)

    Abstract

    Information overload has become a serious problem in the digital age, and it hampers the understanding of useful information. Alleviating this problem is a central concern of research on natural language processing, especially multi-document summarization. With the aim of finding a new method to help judge the importance of similar sentences in multi-document summarization, this study proposes a novel approach based on recent hierarchical Bayesian topic models. The proposed model incorporates n-grams into hierarchical latent topics to capture the word dependencies that appear in the local context of a word. Quantitative and qualitative evaluation results show that this model outperforms both hLDA and LDA in document modeling. In addition, experimental results demonstrate that our summarization system implementing this model significantly improves performance, making it comparable to state-of-the-art summarization systems.

    Original language: English
    Pages (from-to): 1340-1352
    Number of pages: 13
    Journal: Expert Systems with Applications
    Volume: 42
    Issue number: 3
    Publication status: Published - 15 Feb. 2015

    Keywords

    • Contextual topic
    • Hierarchical topic model
    • Multi-document summarization
