2015 Grapheme-to-Phoneme Conversion Using Long Short-Term Memory Recurrent Neural Networks


Subject Headings: Grapheme-to-Phoneme (G2P) Conversion.

Notes

Cited By

Quotes

Abstract

Grapheme-to-phoneme (G2P) models are key components in speech recognition and text-to-speech systems, as they describe how words are pronounced. We propose a G2P model based on a Long Short-Term Memory (LSTM) recurrent neural network (RNN). In contrast to traditional joint-sequence based G2P approaches, LSTMs have the flexibility to take the full context of graphemes into account, transforming the problem from a series of grapheme-to-phoneme conversions into a single word-to-pronunciation conversion. Training joint-sequence based G2P models requires explicit grapheme-to-phoneme alignments, which are not straightforward to obtain since graphemes and phonemes do not correspond one-to-one. The LSTM-based approach forgoes the need for such explicit alignments. We experiment with a unidirectional LSTM (ULSTM) with different output delays and a deep bidirectional LSTM (DBLSTM) with a connectionist temporal classification (CTC) layer. The DBLSTM-CTC model achieves a word error rate (WER) of 25.8% on the public CMU dataset for US English. Combining the DBLSTM-CTC model with a joint n-gram model results in a WER of 21.3%, a 9% relative improvement over the previous best WER of 23.4% from a hybrid system.
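The DBLSTM-CTC setup described in the abstract can be made concrete with a short sketch. The code below is a minimal illustration, assuming PyTorch; the model class `DBLSTMCTCG2P`, the vocabulary sizes, and the tensor shapes are hypothetical stand-ins, not the authors' implementation. It shows the core idea: a deep bidirectional LSTM reads the whole grapheme sequence in both directions, and a CTC loss trains it to emit a phoneme sequence without any explicit grapheme-to-phoneme alignment.

```python
# Minimal sketch, assuming PyTorch. Vocabulary sizes and names are
# hypothetical; this illustrates a DBLSTM with a CTC layer for G2P,
# not the authors' implementation.
import torch
import torch.nn as nn

class DBLSTMCTCG2P(nn.Module):
    """Deep bidirectional LSTM mapping a grapheme sequence to
    per-position phoneme scores; CTC handles the alignment."""

    def __init__(self, num_graphemes, num_phonemes,
                 embed_dim=64, hidden_dim=128, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(num_graphemes, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=num_layers,
                            bidirectional=True, batch_first=True)
        # One extra output class for the CTC blank symbol (index 0).
        self.proj = nn.Linear(2 * hidden_dim, num_phonemes + 1)

    def forward(self, graphemes):
        # graphemes: (batch, word_len) integer grapheme ids
        x = self.embed(graphemes)
        x, _ = self.lstm(x)            # (batch, word_len, 2 * hidden_dim)
        # nn.CTCLoss expects (word_len, batch, classes) log-probabilities.
        return self.proj(x).log_softmax(-1).transpose(0, 1)

# Hypothetical sizes: 27 graphemes (a-z plus apostrophe), 40 phonemes.
model = DBLSTMCTCG2P(num_graphemes=27, num_phonemes=40)
ctc = nn.CTCLoss(blank=0)

words = torch.randint(0, 27, (8, 10))   # batch of 8 ten-grapheme words
prons = torch.randint(1, 41, (8, 5))    # 5 target phoneme ids per word
log_probs = model(words)                # (10, 8, 41)
input_lens = torch.full((8,), 10, dtype=torch.long)
target_lens = torch.full((8,), 5, dtype=torch.long)
loss = ctc(log_probs, prons, input_lens, target_lens)
loss.backward()                          # backpropagation through time
```

As a sanity check on the abstract's numbers, the quoted gain is consistent: (23.4 − 21.3) / 23.4 ≈ 0.090, i.e., about a 9% relative WER reduction.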

References

(Peng et al., 2015) ⇒ Fuchun Peng, Kanishka Rao, Haşim Sak, and Françoise Beaufays. (2015). "Grapheme-to-Phoneme Conversion Using Long Short-Term Memory Recurrent Neural Networks." In: Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2015). doi:10.1109/ICASSP.2015.7178767