LSTM-based Language Model (LM) Training Algorithm
An LSTM-based Language Model (LM) Training Algorithm is an RNN-based LM algorithm that uses LSTM networks as its recurrent units.
- Context:
  - It can be implemented by an LSTM-based LM System (a minimal training sketch follows this list).
  - …
- Counter-Example(s):
- See: RNN-based LM Algorithm.
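The sketch below illustrates the idea under assumed choices that this page does not specify: PyTorch as the framework, a toy random token corpus, and hypothetical hyperparameters (`vocab_size`, `embed_dim`, `hidden_dim`). It trains an LSTM to predict the next token with cross-entropy loss.

```python
# Minimal sketch of an LSTM-based language model training loop (assumed PyTorch).
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        hidden_states, _ = self.lstm(self.embed(tokens))
        return self.proj(hidden_states)  # (batch, seq_len, vocab_size) logits

vocab_size, seq_len, batch_size = 1000, 32, 16  # hypothetical toy settings
model = LSTMLanguageModel(vocab_size)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    # Toy random batch; a real system would stream token ids from a text corpus.
    batch = torch.randint(0, vocab_size, (batch_size, seq_len + 1))
    inputs, targets = batch[:, :-1], batch[:, 1:]   # next-token prediction
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice the random batches would be replaced with token ids drawn from a real corpus, and training progress would be tracked with held-out perplexity (the exponential of the cross-entropy loss).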
References
2017b
- (Yang, Hu et al., 2017) ⇒ Zichao Yang, Zhiting Hu, Ruslan Salakhutdinov, and Taylor Berg-Kirkpatrick. (2017). “Improved Variational Autoencoders for Text Modeling Using Dilated Convolutions.” In: Proceedings of the 34th International Conference on Machine Learning (ICML-2017).
- QUOTE: Recent work on generative modeling of text has found that variational auto-encoders (VAE) incorporating LSTM decoders perform worse than simpler LSTM language models (Bowman et al., 2015). This negative result is so far poorly understood, but has been attributed to the propensity of LSTM decoders to ignore conditioning information from the encoder. …
2015
- (Bowman et al., 2015) ⇒ Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, and Samy Bengio. (2015). “Generating Sentences from a Continuous Space.” arXiv preprint arXiv:1511.06349.