2006 DynamicTopicModels


Subject Headings: Topic Modeling Algorithm, Topic Tracking Modeling Algorithm, Document Topic Evolution.


A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed to carry out approximate posterior inference over the latent topics. In addition to giving quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection. The models are demonstrated by analyzing the OCR'ed archives of the journal Science from 1880 through 2000.
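The state space idea in the abstract can be illustrated with a small sketch. It assumes the simplest possible case: a single topic whose natural parameters follow a Gaussian random walk across time slices and are mapped to word probabilities with a softmax. The function names, the vocabulary size, and the value of `sigma` are illustrative assumptions, not taken from the paper, and the sketch covers only the generative side, not the variational inference.

```python
import math
import random


def softmax(beta):
    """Map natural parameters to a probability vector over the vocabulary."""
    m = max(beta)  # subtract the max for numerical stability
    exps = [math.exp(b - m) for b in beta]
    total = sum(exps)
    return [e / total for e in exps]


def evolve_topic(beta0, num_slices, sigma, rng):
    """Gaussian random walk: beta_t = beta_{t-1} + Normal(0, sigma^2) per word."""
    betas = [list(beta0)]
    for _ in range(num_slices - 1):
        prev = betas[-1]
        betas.append([b + rng.gauss(0.0, sigma) for b in prev])
    return betas


def sample_words(beta_t, num_words, rng):
    """Draw word indices from the topic's multinomial at one time slice."""
    probs = softmax(beta_t)
    return rng.choices(range(len(probs)), weights=probs, k=num_words)


# Hypothetical settings chosen only for illustration.
rng = random.Random(0)
vocab_size = 5
betas = evolve_topic([0.0] * vocab_size, num_slices=4, sigma=0.5, rng=rng)
for t, beta_t in enumerate(betas):
    print(t, [round(p, 3) for p in softmax(beta_t)])
```

Because the random walk acts on unconstrained natural parameters rather than directly on probabilities, each slice's word distribution stays valid (non-negative, summing to one) however far the parameters drift.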



Author: David M. Blei, John D. Lafferty
Title: Dynamic Topic Models
Venue: ICML 2006
URL: http://www.cs.princeton.edu/~blei/papers/BleiLafferty2006a.pdf
DOI: 10.1145/1143844.1143859
Year: 2006