
Automatic Extraction of Topics from Documents: Five Probabilistic Topic Model Tests

(Volume 2, Issue 11, November 2016) OPEN ACCESS

Author(s):

Sandra Jhean-Larose, Nicolas Leveau, Guy Denhiere, Ba-Linh Nguyen

Abstract:

In this paper, we test the capability of the Topic model to extract topics from documents (Griffiths & Steyvers, 2003, 2004; Griffiths, Steyvers & Tenenbaum, 2007). After presenting the mathematical aspects of the model and demonstrating its behavior on a small corpus, we attempt to falsify the model by manipulating (i) the size of and similarities between the sub-corpora, (ii) the relative weight of the sub-corpora, and (iii) the permeability to the scope and nature of contexts added to a fixed corpus. The model passed all five tests, demonstrating first that the extracted topics were relevant and congruent to the content of the corpus, and second that their probabilities appropriately reflected the relative weight of the sub-corpora.
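
As an illustration of the kind of check the abstract describes, the following is a minimal sketch of a collapsed Gibbs sampler for a probabilistic topic model, run on a toy corpus made of two sub-corpora of unequal weight. It is not the authors' implementation: the corpus, the hyperparameters `alpha` and `beta`, the function name `gibbs_lda`, and the use of each topic's share of token assignments as a proxy for topic probability are all assumptions made for this example.

```python
import random

def gibbs_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for a topic model (illustrative sketch).

    Returns (topic_weight, top_words): the share of tokens assigned to each
    topic, and the three most frequent words of each topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    n_dk = [[0] * n_topics for _ in docs]       # document-topic counts
    n_kw = [[0] * V for _ in range(n_topics)]   # topic-word counts
    n_k = [0] * n_topics                        # tokens per topic
    z = []                                      # topic of every token

    # Random initial assignment of a topic to every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_kw[t][v] + beta) / (n_k[t] + V * beta) * (n_dk[d][t] + alpha)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    total = sum(n_k)
    topic_weight = [count / total for count in n_k]
    top_words = [
        [vocab[v] for v in sorted(range(V), key=lambda v: -n_kw[t][v])[:3]]
        for t in range(n_topics)
    ]
    return topic_weight, top_words

# Hypothetical toy corpus: sub-corpus A (6 documents) outweighs sub-corpus B (2).
sub_a = [["cat", "dog", "pet", "dog", "cat"]] * 6
sub_b = [["stock", "market", "bank", "stock", "bank"]] * 2
docs = sub_a + sub_b

topic_weight, top_words = gibbs_lda(docs, n_topics=2)
for weight, words in zip(topic_weight, top_words):
    print(f"{weight:.2f}", words)
```

If the sampler separates the two vocabularies, the printed topic weights can be compared with the sub-corpus proportions (here 6 to 2). The result depends on the seed and the number of iterations, so a single run should be read as a demonstration and not as a test of the model.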


International Journal of New Technology and Research

Impact Factor 3.953
