2010 DiscriminativeTopicModelingbase


Subject Headings: Discriminative Topic Modeling, Manifold Learning.

Notes

Cited By

Quotes

Author Keywords

Topic modeling, Dimensionality reduction, Document classification, Semi-supervised learning.

Abstract

Topic modeling has been popularly used for data analysis in various domains including text documents. Previous topic models, such as probabilistic Latent Semantic Analysis (pLSA) and Latent Dirichlet Allocation (LDA), have shown impressive success in discovering low-rank hidden structures for modeling text documents. These models, however, do not take into account the manifold structure of data, which is generally informative for the non-linear dimensionality reduction mapping. More recent models, namely Laplacian PLSI (LapPLSI) and Locally-consistent Topic Model (LTM), have incorporated the local manifold structure into topic models and have shown the resulting benefits. But these approaches fall short of the full discriminating power of manifold learning as they only enhance the proximity between the low-rank representations of neighboring pairs without any consideration for non-neighboring pairs. In this paper, we propose Discriminative Topic Model (DTM) that separates non-neighboring pairs from each other in addition to bringing neighboring pairs closer together, thereby preserving the global manifold structure as well as improving the local consistency. We also present a novel model fitting algorithm based on the generalized EM and the concept of Pareto improvement. As a result, DTM achieves higher classification performance in a semi-supervised setting by effectively exposing the manifold structure of data. We provide empirical evidence on text corpora to demonstrate the success of DTM in terms of classification accuracy and robustness to parameters compared to state-of-the-art techniques.
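The core idea sketched in the abstract, a topic model whose low-rank document representations are pulled together for neighboring documents and pushed apart for non-neighboring ones, can be illustrated with a small sketch. The following Python snippet is an illustrative assumption, not the authors' implementation: the function name, the ratio form of the penalty, and the use of topic proportions theta are all hypothetical.

import numpy as np

def manifold_regularizer(theta, neighbor_pairs, non_neighbor_pairs):
    """theta: (n_docs, n_topics) topic proportions; pairs: lists of (i, j) document indices.
    Returns a penalty that is small when neighboring documents have similar topic
    proportions and non-neighboring documents have dissimilar ones."""
    attract = sum(np.sum((theta[i] - theta[j]) ** 2) for i, j in neighbor_pairs)
    repel = sum(np.sum((theta[i] - theta[j]) ** 2) for i, j in non_neighbor_pairs)
    # Minimizing attract / repel favors local consistency (neighbors stay close)
    # while preserving global separation (non-neighbors are pushed apart).
    return attract / (repel + 1e-12)

In the paper itself, the trade-off between the data likelihood and such a regularizer is handled through a generalized EM procedure with Pareto-improvement checks; the ratio form above is only one way to express the attract/repel trade-off described in the abstract.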

Introduction

Traditional manifold learning algorithms [17, 14, 2] have given way to recent graph-based semi-supervised learning algorithms [19, 18, 3]. The goal of manifold learning is to recover the structure of a given dataset by non-linear mapping into a low-dimensional space. As a manifold learning algorithm, Laplacian Eigenmaps [2] was developed based on spectral graph theory [8].
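As a concrete illustration of the Laplacian Eigenmaps procedure mentioned above, the sketch below builds a k-nearest-neighbor graph, forms the graph Laplacian L = D - W, and embeds the data using the smallest non-trivial generalized eigenvectors. This is a minimal sketch assuming NumPy/SciPy with binary edge weights; it is not code from the paper.

import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_components=2, k=5):
    n = X.shape[0]
    # Pairwise squared distances and a symmetric kNN adjacency with binary weights.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]   # skip the point itself
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)                  # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W
    # Solve the generalized eigenproblem L v = lambda D v (eigenvalues ascending).
    vals, vecs = eigh(L, D)
    # The smallest eigenvector is constant and uninformative, so start from the second.
    return vecs[:, 1:n_components + 1]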

References

Seungil Huh, and Stephen E. Fienberg. (2010). "Discriminative Topic Modeling based on Manifold Learning." In: Proceedings of KDD-2010. doi:10.1145/1835804.1835888