2020 WERWeAreandWERWeThinkWeAre

From GM-RKB

Subject Headings: Word Error Rate; Automatic Speech Recognition (ASR) System.

Notes

Cited By

Quotes

Abstract

Natural language processing of conversational speech requires the availability of high-quality transcripts. In this paper, we express our skepticism towards the recent reports of very low Word Error Rates (WERs) achieved by modern Automatic Speech Recognition (ASR) systems on benchmark datasets. We outline several problems with popular benchmarks and compare three state-of-the-art commercial ASR systems on an internal dataset of real-life spontaneous human conversations and HUB'05 public benchmark. We show that WERs are significantly higher than the best reported results. We formulate a set of guidelines which may aid in the creation of real-life, multi-domain datasets with high quality annotations for training and testing of robust ASR systems.

References

BibTeX

@inproceedings{2020_WERWeAreandWERWeThinkWeAre,
  author    = {Piotr Szymanski and
               Piotr Zelasko and
               Mikolaj Morzy and
               Adrian Szymczak and
               Marzena Zyla-Hoppe and
               Joanna Banaszczak and
               Lukasz Augustyniak and
               Jan Mizgajski and
               Yishay Carmiel},
  editor    = {Trevor Cohn and
               Yulan He and
               Yang Liu},
  title     = {{WER} we are and {WER} we think we are},
  booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural
               Language Processing: Findings (EMNLP 2020) Online Event},
  series    = {Findings of ACL},
  volume    = {EMNLP 2020},
  pages     = {3290--3295},
  publisher = {Association for Computational Linguistics},
  year      = {2020},
  url       = {https://doi.org/10.18653/v1/2020.findings-emnlp.295},
  doi       = {10.18653/v1/2020.findings-emnlp.295},
}


Page: 2020 WERWeAreandWERWeThinkWeAre
Author: Piotr Szymanski; Piotr Zelasko; Mikolaj Morzy; Adrian Szymczak; Marzena Zyla-Hoppe; Joanna Banaszczak; Lukasz Augustyniak; Jan Mizgajski; Yishay Carmiel
Title: WER We Are and WER We Think We Are