1999 StatisticalModelsForTextSegmentation

(Beeferman et al., 1999) ⇒ Doug Beeferman, Adam Berger, and John D. Lafferty. (1999). “Statistical Models for Text Segmentation.” In: Machine Learning, 34(1–3).

Subject Headings: Text Segmentation Algorithm, Statistical Algorithm.

Notes
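
The abstract quoted below mentions a new probabilistically motivated error metric. In later literature it is usually written P_k: the probability that two positions a fixed distance k apart are classified inconsistently, meaning the reference places them in the same segment and the hypothesis does not, or vice versa. The Python sketch below is an illustration, not the authors' code. It assumes the fixed-distance variant, with k defaulting to half the mean reference segment length (a common convention). The function names, the boundary encoding (indices at which a new segment starts), and the toy data are all hypothetical.

```python
from typing import List, Optional, Sequence


def segment_ids(boundaries: Sequence[int], n: int) -> List[int]:
    """Map each of n positions to the index of the segment containing it.

    `boundaries` lists the positions at which a new segment starts.
    Position 0 and out-of-range values are ignored (assumed encoding).
    """
    starts = {b for b in boundaries if 0 < b < n}
    ids, current = [], 0
    for i in range(n):
        if i in starts:
            current += 1
        ids.append(current)
    return ids


def pk_error(ref: Sequence[int], hyp: Sequence[int], n: int,
             k: Optional[int] = None) -> float:
    """Fraction of position pairs (i, i + k) on which ref and hyp disagree
    about whether the two positions belong to the same segment."""
    ref_ids = segment_ids(ref, n)
    hyp_ids = segment_ids(hyp, n)
    if k is None:
        # Assumed default: half the mean reference segment length.
        num_ref_segments = ref_ids[-1] + 1
        k = max(1, round(n / num_ref_segments / 2))
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    disagreements = 0
    for i in range(n - k):
        same_in_ref = ref_ids[i] == ref_ids[i + k]
        same_in_hyp = hyp_ids[i] == hyp_ids[i + k]
        if same_in_ref != same_in_hyp:
            disagreements += 1
    return disagreements / (n - k)


# Toy data (hypothetical): 12 positions, true segments start at 4 and 8.
reference = [4, 8]
exact = pk_error(reference, [4, 8], n=12)      # identical segmentation
near_miss = pk_error(reference, [5, 8], n=12)  # one boundary off by one
no_bounds = pk_error(reference, [], n=12)      # no boundaries proposed
```

On the toy data, the hypothesis that misplaces one boundary by a single position scores lower (better) than the one that proposes no boundaries at all. Plain boundary precision and recall do not grade near misses this way.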

Cited By

~378 http://scholar.google.com/scholar?cites=4536413215121505859

2000

(McCallum et al., 2000) ⇒ Andrew McCallum, Dayne Freitag, and Fernando Pereira. (2000). “Maximum Entropy Markov Models for Information Extraction and Segmentation.” In: Proceedings of ICML-2000.

Quotes

Abstract

This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which may be domain-specific, that tend to be used near segment boundaries. Assessment of our approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts. Quantitative results on these domains are presented using a new probabilistically motivated error metric, which combines precision and recall in a natural and flexible way. This metric is used to make a quantitative assessment of the relative contributions of the different feature types, as well as a comparison with decision trees and previously proposed text segmentation algorithms.

References

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
1999 StatisticalModelsForTextSegmentation	Doug Beeferman, Adam Berger, John D. Lafferty			Statistical Models for Text Segmentation		Machine Learning (ML) Subject Area	http://www-2.cs.cmu.edu/~aberger/pdf/ml.pdf			1999