2016 StillNotThereComparingTradition

From GM-RKB

Subject Headings: Encoder-Decoder Neural Network; Sequence-To-Sequence Neural Network; Spelling Error Correction; Monotone String Translation Task; Attention-Encoder-Decoder Neural Network; Morphological Inflection Encoder-Decoder Neural Network.

Notes

Cited By

Quotes

Abstract

We analyze the performance of encoder-decoder neural models and compare them with well-known established methods. The latter represent different classes of traditional approaches that are applied to the monotone sequence-to-sequence tasks of OCR post-correction, spelling correction, grapheme-to-phoneme conversion, and lemmatization. Such tasks are of practical relevance for various higher-level research fields including digital humanities, automatic text correction, and speech recognition. We investigate how well generic deep-learning approaches adapt to these tasks, and how they perform in comparison with established and more specialized methods, including our own adaptation of pruned CRFs.

1 Introduction

2 Task Description

3 Data

4 Model Description

In this section, we briefly describe encoder-decoder neural models, pruned CRFs, and our three baselines.

4.1 Encoder-Decoder Neural Models

We compare three variants of encoder-decoder models: the ‘classic’ variant and two modifications.

Figure 1: In the encoder-decoder model, the encoder (bottom) generates a representation of the input sequence $\vec{x}$ from which the decoder (top) generates the output sequence $\vec{y}$. The attention-based mechanism (shown here) enables the decoder to “peek” into the input at every decoding step through multiple input representations $a_t$. Illustration from Bahdanau et al. (2014).
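The excerpt above does not include an implementation. Below is a minimal PyTorch sketch of an attention-based encoder-decoder of the kind shown in Figure 1, specialised to character-level, monotone string translation (e.g. spelling correction). The class name, hyperparameters, simplified attention scorer, and toy data are illustrative assumptions, not the authors' setup, which follows Bahdanau et al. (2015).

<pre>
# Minimal sketch of a Bahdanau-style attention encoder-decoder for
# character-level, monotone string translation (e.g. spelling correction).
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionEncoderDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional GRU encoder produces one state per input character.
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        # Each decoder step consumes the previous output character and the
        # attention-weighted context vector a_t ("peeking" into the input).
        self.decoder = nn.GRUCell(emb_dim + 2 * hid_dim, hid_dim)
        self.score = nn.Linear(2 * hid_dim + hid_dim, 1)  # simplified attention scorer
        self.out = nn.Linear(hid_dim + 2 * hid_dim, vocab_size)

    def forward(self, x, y_in):
        # x: (batch, src_len) input character ids
        # y_in: (batch, tgt_len) decoder inputs = gold output shifted right (teacher forcing)
        enc_states, _ = self.encoder(self.embed(x))        # (batch, src_len, 2*hid)
        batch, src_len, _ = enc_states.size()
        h = enc_states.new_zeros(batch, self.decoder.hidden_size)
        logits = []
        for t in range(y_in.size(1)):
            # Attention: score each encoder state against the decoder state,
            # normalise, and form the context vector a_t.
            h_rep = h.unsqueeze(1).expand(-1, src_len, -1)
            scores = self.score(torch.cat([enc_states, h_rep], dim=-1)).squeeze(-1)
            alpha = F.softmax(scores, dim=-1)               # (batch, src_len)
            context = torch.bmm(alpha.unsqueeze(1), enc_states).squeeze(1)
            h = self.decoder(torch.cat([self.embed(y_in[:, t]), context], dim=-1), h)
            logits.append(self.out(torch.cat([h, context], dim=-1)))
        return torch.stack(logits, dim=1)                   # (batch, tgt_len, vocab)


if __name__ == "__main__":
    # Toy training step on random character ids; id 0 serves as the start symbol.
    model = AttentionEncoderDecoder(vocab_size=30)
    x = torch.randint(1, 30, (2, 10))                       # two noisy input strings
    y = torch.randint(1, 30, (2, 12))                       # their corrected outputs
    y_in = torch.cat([torch.zeros(2, 1, dtype=torch.long), y[:, :-1]], dim=1)
    loss = F.cross_entropy(model(x, y_in).reshape(-1, 30), y.reshape(-1))
    loss.backward()
    print(float(loss))
</pre>

In this sketch the decoder recomputes attention weights over all encoder states at every output step, which is what lets it copy or correct individual characters; the 'classic' (attention-free) variant would instead condition only on a single fixed-length encoding of $\vec{x}$.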

4.2 Pruned Conditional Random Fields

4.3 Further Baseline Systems

5 Results and Analysis

5.1 Model Performances

5.2 Training Time

6 Conclusions

Acknowledgements

References

  • 1. (Bahdanau et al., 2015) ⇒ Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. (2015). “Neural Machine Translation by Jointly Learning to Align and Translate.” In: Proceedings of the Third International Conference on Learning Representations (ICLR 2015).
  • 2. Bisani, M., & Ney, H. (2008). Joint-sequence models for grapheme-to-phoneme conversion. Speech Communication, 50(5), 434–451.
  • 3. Brill, E., & Moore, R. C. (2000). An improved error model for noisy channel spelling correction. In ACL ’00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics (pp. 286–293).
  • 4. Charniak, E., & Johnson, M. (2005). Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05) (pp. 173–180).
  • 5. Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1724–1734).
  • 6. Chrupala, G. (2014). Normalizing tweets with edit scripts and recurrent neural embeddings. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 680–686).
  • 7. (Collobert et al., 2011b) ⇒ Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. (2011). “Natural Language Processing (Almost) from Scratch.” In: The Journal of Machine Learning Research, 12.
  • 8. Cucerzan, S., & Brill, E. (2004). Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users. In EMNLP (pp. 293–300).
  • 9. Eger, S. (2015). Designing and Comparing G2P-Type Lemmatizers for a Morphology-Rich Language. In International Workshop on Systems and Frameworks for Computational Morphology (pp. 27–40).
  • 10. Eger, S., Brück, T. vor der, & Mehler, A. (2016). A Comparison of Four Character-Level String-to-String Translation Models for (OCR) Spelling Error Correction. The Prague Bulletin of Mathematical Linguistics, 105(1), 77–99.
  • (Farra et al., 2014) ⇒ Noura Farra, Nadi Tomeh, Alla Rozovskaya, and Nizar Habash. 2014. Generalized Character-Level Spelling Error Correction. In: Proceedings of ACL ’14, pages 161–167, Baltimore, MD, USA. Association for Computational Linguistics.
  • (Faruqui et al., 2016) ⇒ Manaal Faruqui, Yulia Tsvetkov, Graham Neubig, and Chris Dyer. (2016). “Morphological Inflection Generation Using Character Sequence to Sequence Learning.” In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. doi:10.18653/v1/N16-1077
  • (Gu et al., 2016) ⇒ Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. (2016). “Incorporating Copying Mechanism in Sequence-to-Sequence Learning.” In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). doi:10.18653/v1/P16-1154
  • 11. Gubanov, S., Galinskaya, I., & Baytin, A. (2014). Improved Iterative Correction for Distant Spelling Errors. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 168–173).
  • 12. Jiampojamarn, S., Cherry, C., & Kondrak, G. (2010). Integrating Joint n-gram Features into a Discriminative Training Framework. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 697–700).
  • 13. Kominek, J., & Black, A. W. (2004). The CMU Arctic speech databases. SSW, 223–224.
  • 14. Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In ICML ’01 Proceedings of the Eighteenth International Conference on Machine Learning (pp. 282–289).
  • 15. Lewellen, M. (1998). Neural Network Recognition of Spelling Errors. In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2 (pp. 1490–1492).
  • 16. Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. ArXiv Preprint ArXiv:1508.04025.
  • 17. Mueller, T., Schmid, H., & Schütze, H. (2013). Efficient Higher-Order CRFs for Morphological Tagging. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (pp. 322–332).
  • 18. Okazaki, N., Tsuruoka, Y., Ananiadou, S., & Tsujii, J. (2008). A Discriminative Candidate Generator for String Transformations. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (pp. 447–456).
  • 19. Raaijmakers, S. (2013). A Deep Graphical Model for Spelling Correction. BNAIC 2013: Proceedings of the 25th Benelux Conference on Artificial Intelligence, Delft, The Netherlands, November 7-8, 2013.
  • 20. Rao, K., Peng, F., Sak, H., & Beaufays, F. (2015). Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4225–4229).
  • 21. Reynaert, M. (2014). On OCR ground truths and OCR post-correction gold standards, tools and formats. In: Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage (pp. 159–166).
  • 22. Richmond, K., Clark, R. A. J., & Fitt, S. (2009). Robust LTS rules with the Combilex speech technology lexicon. In INTERSPEECH (pp. 1295–1298).
  • 23. Schmaltz, A. R., Kim, Y., Rush, A. M., & Shieber, S. M. (2016). Sentence-level grammatical error identification as sequence-to-sequence correction. In: Proceedings of the Eleventh Workshop on Innovative Use of NLP for Building Educational Applications (pp. 242–251).
  • 24. Sherif, T., & Kondrak, G. (2007). Substring-Based Transliteration. In: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics (pp. 944–951).
  • 25. (Sutskever et al., 2014) ⇒ Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. (2014). “Sequence to Sequence Learning with Neural Networks.” In: Advances in Neural Information Processing Systems. arXiv:1409.3215
  • 27. Vinyals, O., & Le, Q. V. (2015). A Neural Conversational Model. ArXiv Preprint ArXiv:1506.05869.
  • 28. Vukotić, V., Raymond, C., & Gravier, G. (2015). Is it time to switch to Word Embedding and Recurrent Neural Networks for Spoken Language Understanding. In InterSpeech (pp. 130–134).
  • 29. Wang, Z., Xu, G., Li, H., & Zhang, M. (2014). A Probabilistic Approach to String Transformation. IEEE Transactions on Knowledge and Data Engineering, 26(5), 1063–1075.
  • 30. Xie, Z., Avati, A., Arivazhagan, N., Jurafsky, D., & Ng, A. Y. (2016). Neural Language Correction with Character-Based Attention. ArXiv Preprint ArXiv:1603.09727.
  • 31. Yao, K., & Zweig, G. (2015). Sequence-to-sequence neural net models for grapheme-to-phoneme conversion. In INTERSPEECH (pp. 3330–3334).
  • 32. Yin, W., Ebert, S., & Schütze, H. (2016). Attention-Based Convolutional Neural Network for Machine Comprehension. In: Proceedings of the Workshop on Human-Computer Question Answering (pp. 15–21).


Author(s): Carsten Schnober, Steffen Eger, Erik-Lan Do Dinh, and Iryna Gurevych (linked co-authors of cited works: Chris Dyer, Hang Li, Pavel Kuksa, Ronan Collobert, Koray Kavukcuoglu, Jason Weston, Léon Bottou, Yoshua Bengio, Michael Karlen, Ilya Sutskever, Oriol Vinyals, Quoc V. Le, Zhengdong Lu, Kyunghyun Cho, Dzmitry Bahdanau, Graham Neubig, Manaal Faruqui, Yulia Tsvetkov, Jiatao Gu, Victor O.K. Li)
Title: Still Not There? Comparing Traditional Sequence-to-Sequence Models to Encoder-Decoder Neural Networks on Monotone String Translation Tasks
Year: 2016