1992 ClassBasedNGramModelsOfNL


Subject Headings: Word N-gram Model, Brown Word-Hierarchy Cluster, Brown et al Clustering Algorithm.

Notes

Cited By

Quotes

Abstract

We address the problem of predicting a word from previous words in a sample of text. In particular, we discuss n-gram models based on classes of words. We also discuss several statistical algorithms for assigning words to classes based on the frequency of their co-occurrence with other words. We find that we are able to extract classes that have the flavor of either syntactically based groupings or semantically based groupings, depending on the nature of the underlying statistics.
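The abstract does not spell out a formula, so the following is only a minimal sketch under stated assumptions. It assumes the usual class-based bigram factorization, P(w_i | w_{i-1}) = P(w_i | c_i) · P(c_i | c_{i-1}), where each word belongs to exactly one class, and it uses unsmoothed relative-frequency estimates. The function names, the toy corpus, and the hand-assigned classes are hypothetical illustrations, not taken from the paper. The last function computes the average mutual information between adjacent classes, one statistic that could be used to compare candidate word-to-class assignments.

```python
from collections import Counter
from math import log2


def train_class_bigram(tokens, word_to_class):
    """Collect the counts needed for P(w | c) and P(c | c_prev).

    Assumption: every word maps to exactly one class (a hard partition).
    """
    classes = [word_to_class[w] for w in tokens]
    word_counts = Counter(tokens)
    class_counts = Counter(classes)
    class_bigrams = Counter(zip(classes, classes[1:]))
    # Counts of classes that appear as the left element of some bigram.
    prev_class_counts = Counter(classes[:-1])
    return word_counts, class_counts, class_bigrams, prev_class_counts


def class_bigram_prob(prev_word, word, word_to_class, model):
    """Unsmoothed estimate of P(word | prev_word) via the class factorization."""
    word_counts, class_counts, class_bigrams, prev_class_counts = model
    c_prev, c = word_to_class[prev_word], word_to_class[word]
    p_word_given_class = word_counts[word] / class_counts[c]
    p_class_given_prev = class_bigrams[(c_prev, c)] / prev_class_counts[c_prev]
    return p_word_given_class * p_class_given_prev


def average_mutual_information(tokens, word_to_class):
    """Average mutual information (in bits) between adjacent classes."""
    classes = [word_to_class[w] for w in tokens]
    pairs = Counter(zip(classes, classes[1:]))
    total = sum(pairs.values())
    left, right = Counter(), Counter()
    for (c1, c2), n in pairs.items():
        left[c1] += n
        right[c2] += n
    return sum(
        (n / total) * log2(n * total / (left[c1] * right[c2]))
        for (c1, c2), n in pairs.items()
    )


# Hypothetical toy corpus and hand-assigned classes, for illustration only.
tokens = "the cat sat the dog sat the cat ran".split()
word_to_class = {"the": "DET", "cat": "NOUN", "dog": "NOUN",
                 "sat": "VERB", "ran": "VERB"}
model = train_class_bigram(tokens, word_to_class)
print(class_bigram_prob("the", "dog", word_to_class, model))
print(average_mutual_information(tokens, word_to_class))
```

Because the bigram statistics are shared across all words in a class, the model assigns a nonzero probability to a pair such as "dog ran" even though that pair never occurs in the toy corpus. A real system would add smoothing and would learn the classes from data rather than assign them by hand.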

References

  • 1. Averbuch, A.; Bahl, L.; Bakis, R.; Brown, P.; Cole, A.; Daggett, G.; Das, S.; Davies, K.; De Gennaro, S.; de Souza, P.; Epstein, E.; Fraleigh, D.; Jelinek, F.; Moorhead, J.; Lewis, B.; Mercer, R.; Nadas, A.; Nahamoo, D.; Picheny, M.; Shichman, G.; Spinelli, P.; Van Compernolle, D.; and Wilkens, H. (1987). “Experiments with the Tangora 20,000 word speech recognizer.” In: Proceedings, IEEE International Conference on Acoustics, Speech and Signal Processing. Dallas, Texas, 701--704.
  • 3. Baum, L. (1972). “An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process.” Inequalities, 3, 1--8.
  • 4. Peter F. Brown, John Cocke, Stephen A. Della Pietra, Vincent J. Della Pietra, Frederick Jelinek, John D. Lafferty, Robert L. Mercer, Paul S. Roossin, A statistical approach to machine translation, Computational Linguistics, v.16 n.2, p.79-85, June 1990
  • 5. Arthur P. Dempster; Laird, N.; and Rubin, D. (1977). “Maximum likelihood from incomplete data via the EM algorithm.” In: Journal of the Royal Statistical Society, 39(B), 1--38.
  • 6. Feller, W. (1950). An Introduction to Probability Theory and its Applications, Volume I. John Wiley & Sons, Inc.
  • 7. Robert G. Gallager, Information Theory and Reliable Communication, John Wiley & Sons, Inc., New York, NY, 1968
  • 8. Good, I. (1953). “The population frequencies of species and the estimation of population parameters.” Biometrika, 40(3--4), 237--264.
  • 9. Jelinek, F., and Mercer, R. L. (1980). “Interpolated estimation of Markov source parameters from sparse data.” In: Proceedings, Workshop on Pattern Recognition in Practice, Amsterdam, The Netherlands, 381--397.
  • 10. Kučera, H., and Francis, W. (1967). Computational Analysis of Present Day American English. Brown University Press.
  • 11. Mays, E.; Damerau, F. J.; and Mercer, R. L. (1990). “Context-based spelling correction.” In: Proceedings, IBM Natural Language ITL. Paris, France, 517--522.


Record: 1992 ClassBasedNGramModelsOfNL
Author: Peter F. Brown, Peter V. deSouza, Robert L. Mercer, Vincent J. Della Pietra, Jenifer C. Lai
Title: Class-based N-gram Models of Natural Language
Journal: Computational Linguistics (CL) Research Area
URL: http://acl.ldc.upenn.edu/J/J92/J92-4003.pdf
Year: 1992