Spectral Learning Algorithm

A [[Spectral Learning Algorithm]] is a [[learning algorithm]] that ...

* See: [[]].

== References ==

=== 2013 ===
* ([[Cohen, Collins et al., 2013]]) ⇒ [[Shay Cohen]], [[Michael Collins]], [[Dean Foster]], [[Karl Stratos]] and [[Lyle Ungar]]. ([[2013]]). "Spectral Learning Algorithms for Natural Language Processing." Tutorial at NAACL 2013.
** QUOTE: Recent [[work in machine learning]] and [[recent NLP research|NLP]] has developed [[Spectral Learning Algorithm|spectral algorithm]]s for many [[learning tasks involving latent variables]]. </s> [[Spectral Learning Algorithm|Spectral algorithm]]s rely on [[singular value decomposition]] as a [[basic operation]], usually followed by some [[simple estimation method]] based on the [[method of moment]]s. </s> From a [[theoretical point of view]], these [[Spectral Learning Algorithm|methods]] are appealing in that they offer [[consistent estimator]]s (and [[PAC-style guarantees of sample complexity]]) for several important [[latent-variable model]]s. </s> This is in contrast to the [[EM algorithm]], which is an extremely successful approach, but which only has guarantees of reaching a local maximum of the [[likelihood function]]. </s> <P> From a practical point of view, the methods (unlike [[EM]]) have no need for careful [[initialization]], and have recently been shown to be highly efficient (as one example, in work under submission by the authors on learning of [[latent-variable PCFG]]s, a [[Spectral Learning Algorithm|spectral algorithm]] performs at [[identical accuracy to EM]], but is around 20 times faster). </s> <P> In [[this tutorial]] we will aim to give a broad overview of [[Spectral Learning Algorithm|spectral method]]s, describing theoretical guarantees, as well as practical issues. </s> [[We]] will start by covering the basics of [[singular value decomposition]] and describe efficient methods for doing [[singular value decomposition]]. </s> The [[SVD operation]] is at the core of most [[Spectral Learning Algorithm|spectral algorithm]]s that have been developed. </s> <P> [[We]] will then continue to cover [[canonical correlation analysis (CCA)]]. </s> [[CCA]] is an early method from [[statistics for dimensionality reduction]]. </s> With [[CCA]], two or more views of the data are created, and they are all projected into a [[lower dimensional space]] which [[maximize]]s the correlation between the views. </s> [[We]] will review the basic algorithms underlying [[CCA]], give some formal results giving guarantees for [[latent-variable model]]s and also describe how they have been applied recently to learning lexical representations from [[large quantities of unlabeled data]]. </s> This idea of [[learning lexical representation]]s can be extended further, where [[unlabeled data]] is used to learn underlying representations which are subsequently used as additional information for [[supervised training]]. </s> <P> [[We]] will also cover how [[Spectral Learning Algorithm|spectral algorithm]]s can be used for [[structured prediction problem]]s with [[sequence-based prediction task|sequence]]s and [[parse tree-based prediction task|parse tree]]s. </s> A striking recent result by [[Hsu, Kakade and Zhang (2009)]] shows that [[HMMs]] can be [[learned efficiently]] using a [[Spectral Learning Algorithm|spectral algorithm]]. </s> [[HMMs]] are widely used in [[NLP]] and [[speech task|speech]], and [[previous algorithm]]s (typically based on [[EM]]) were guaranteed to only reach a [[local maximum of the likelihood function]], so this is a crucial result. </s> [[We]] will review the basic mechanics of the [[HMM learning algorithm]], describe its formal guarantees, and also cover practical issues. </s> <P> Last, [[we]] will cover work about [[Spectral Learning Algorithm|spectral algorithm]]s in the context of [[natural language parsing]]. </s> [[We]] will show how [[Spectral Learning Algorithm|spectral algorithm]]s can be used to [[estimate the parameter models of latent-variable PCFGs]], a model which serves as the base for [[state-of-the-art parsing model]]s such as the one of [[Petrov et al. (2007)]]. </s> [[We]] will show what are the practical steps that are needed to be taken in order to make [[Spectral Learning Algorithm|spectral algorithm]]s for [[L-PCFGs]] (or other models in general) practical and comparable to [[state of the art]]. </s>
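The [[CCA]] step quoted above boils down to a [[singular value decomposition]]: center and whiten each view of the data, take the SVD of the whitened cross-covariance, and map the top singular vectors back to the original coordinates. The Python/NumPy sketch below is a minimal illustration of that pipeline, not code from the tutorial; the variable names and the small regularization term reg are assumptions added for numerical stability.

<pre>
import numpy as np

def cca_via_svd(X, Y, k, reg=1e-6):
    """Project two data views X (n x d1) and Y (n x d2) into a shared
    k-dimensional space that maximizes the correlation between them."""
    n = X.shape[0]
    X = X - X.mean(axis=0)                         # center each view
    Y = Y - Y.mean(axis=0)
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])   # within-view covariances,
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])   # regularized for stability
    Cxy = X.T @ Y / n                              # cross-view covariance
    Lx = np.linalg.cholesky(Cxx)                   # whitening factors
    Ly = np.linalg.cholesky(Cyy)
    # SVD of the whitened cross-covariance: the core spectral operation.
    T = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, S, Vt = np.linalg.svd(T)
    Wx = np.linalg.solve(Lx.T, U[:, :k])           # projection for view X
    Wy = np.linalg.solve(Ly.T, Vt[:k].T)           # projection for view Y
    return Wx, Wy, S[:k]                           # S[:k]: canonical correlations
</pre>

X @ Wx and Y @ Wy then give the correlated low-dimensional representations; in the lexical-representation setting mentioned in the quote, the two views might be, for example, a word and its surrounding context.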
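The [[Hsu, Kakade and Zhang (2009)]] result mentioned above is similarly compact in code. Using that paper's moment notation (P1 for unigram probabilities, P21 for the bigram co-occurrence matrix, P3x1 for trigram slices), the algorithm takes an SVD of P21 and forms one observable operator per symbol; joint probabilities are then plain matrix products. The sketch below is a minimal illustration assuming these moments have already been estimated from data (the estimation step is omitted, and the function names are illustrative).

<pre>
import numpy as np

def spectral_hmm(P1, P21, P3x1, m):
    """P1: (n,) with P1[i] = Pr[x1=i]; P21: (n, n) with P21[i, j] = Pr[x2=i, x1=j];
    P3x1: (n, n, n) with P3x1[x][i, j] = Pr[x3=i, x2=x, x1=j]; m: number of
    hidden states. Returns the observable-operator parameterization."""
    U, _, _ = np.linalg.svd(P21)                   # the core spectral step
    U = U[:, :m]                                   # top-m left singular vectors
    UP21_pinv = np.linalg.pinv(U.T @ P21)
    b1 = U.T @ P1                                  # initial statistic
    binf = np.linalg.pinv(P21.T @ U) @ P1          # normalization statistic
    B = [U.T @ P3x1[x] @ UP21_pinv                 # one operator per symbol
         for x in range(len(P1))]
    return b1, binf, B

def sequence_prob(seq, b1, binf, B):
    """Pr[x1, ..., xt] = binf . B[xt] ... B[x1] . b1, up to sampling error."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)
</pre>

No EM-style iteration or initialization is involved: the only nontrivial operations are an SVD and a few pseudo-inverses, which is the source of the consistency guarantee and the speed advantage described in the quote.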


=== 2003 ===
