2020 LexPosFeatureBasedGrammarErrorD

(Agarwal et al., 2020) ⇒ Nancy Agarwal, Mudasir Ahmad Wani, and Patrick Bours. (2020). “Lex-Pos Feature-Based Grammar Error Detection System for the English Language.” In: Electronics, 9(10).

Subject Headings: Grammar Error Detection, Grammar Error Detection Algorithm, Grammar Error Detection System, Lex-Pos Sequence System.

Notes

Source Code available at: https://github.com/Machine-Learning-and-Data-Science/Lex-POS-Approach

Cited By

Google Scholar: ~ 2 Citations. 2021-02-25.

Quotes

Author Keywords

Natural Language Processing; Deep Learning; Grammar Error Detection; Word Embedding.

Abstract

This work focuses on designing a grammar detection system that understands both structural and contextual information of sentences for validating whether the English sentences are grammatically correct. Most existing systems model a grammar detector by translating the sentences into sequences of either words appearing in the sentences or syntactic tags holding the grammar knowledge of the sentences. In this paper, we show that both these sequencing approaches have limitations. The former model is over-specific, whereas the latter model is over-generalized, which in turn affects the performance of the grammar classifier. Therefore, the paper proposes a new sequencing approach that contains both information, linguistic as well as syntactic, of a sentence. We call this sequence a Lex-Pos sequence. The main objective of the paper is to demonstrate that the proposed Lex-Pos sequence has the potential to imbibe the specific nature of the linguistic words (i.e., lexicals) and generic structural characteristics of a sentence via Part-Of-Speech (POS) tags, and so, can lead to a significant improvement in detecting grammar errors. Furthermore, the paper proposes a new vector representation technique, Word Embedding One-Hot Encoding (WEOE) to transform this Lex-Pos into mathematical values. The paper also introduces a new error induction technique to artificially generate the POS tag specific incorrect sentences for training. The classifier is trained using two corpora of incorrect sentences, one with general errors and another with POS tag specific errors. Long Short-Term Memory (LSTM) neural network architecture has been employed to build the grammar classifier. The study conducts nine experiments to validate the strength of the Lex-Pos sequences. The Lex-Pos-based models are observed as superior in two ways: (1) they give more accurate predictions; and (2) they are more stable as lesser accuracy drops have been recorded from training to testing. To further prove the potential of the proposed Lex-Pos-based model, we compare it with some well-known existing studies.
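
The abstract's idea of a sequence that mixes lexical words with POS tags can be illustrated with a minimal sketch. This is not the authors' implementation: the rule for which words stay lexical (here, a hypothetical set of closed-class Penn Treebank tags named KEEP_LEXICAL) and the function name to_lex_pos are assumptions made for illustration only, and the input is taken to be already POS-tagged.

```python
from typing import Iterable, List, Set, Tuple

# Assumption (not from the paper): words carrying these closed-class
# Penn Treebank tags are kept as lexical tokens; all others become tags.
KEEP_LEXICAL: Set[str] = {"DT", "IN", "CC", "TO", "MD", "PRP", "PRP$"}


def to_lex_pos(
    tagged: Iterable[Tuple[str, str]],
    keep_lexical: Set[str] = KEEP_LEXICAL,
) -> List[str]:
    """Build a mixed word/tag sequence from (word, tag) pairs."""
    sequence: List[str] = []
    for word, tag in tagged:
        if tag in keep_lexical:
            # Keep the specific word (lowercased) for closed-class items.
            sequence.append(word.lower())
        else:
            # Generalize open-class words to their POS tag.
            sequence.append(tag)
    return sequence


# A hand-tagged example sentence, so no tagger or model download is needed.
tagged_sentence = [
    ("The", "DT"), ("cat", "NN"), ("sat", "VBD"),
    ("on", "IN"), ("the", "DT"), ("mat", "NN"),
]
print(to_lex_pos(tagged_sentence))
```

Passing an empty keep_lexical set yields a pure POS-tag sequence, and passing every tag yields a pure word sequence, which correspond to the two existing sequencing approaches the abstract describes as over-generalized and over-specific.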

Introduction

Background Study

Lex-Pos Sequence

Datasets and Pre-Processing

Error Induction Methods

Feature Representation

Experiments and Results

Comparative Study

Discussion and Limitations

Conclusions and Future Scope

References

2018

(Chollampatt & Ng, 2018) ⇒ Shamil Chollampatt, and Hwee Tou Ng. (2018). “A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction.” In: Proceedings of the Thirty-Second Conference on Artificial Intelligence (AAAI-2018).

2017a

(Ji et al., 2017) ⇒ Ji, J.; Wang, Q.; Toutanova, K.; Gong, Y.; Truong, S.; Gao, J. A nested attention neural hybrid model for grammatical error correction. arXiv 2017, arXiv:1707.02026.

2017b

(Kaneko et al., 2017) ⇒ Kaneko, M.; Sakaizawa, Y.; Komachi, M. Grammatical error detection using error- and grammaticality-specific word embeddings. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing, Taipei, Taiwan, 27 November–1 December 2017; Volume 1: Long Papers, pp. 40–48.

2017c

(Liu & Liu, 2017) ⇒ Liu, Z.R.; Liu, Y. Exploiting unlabeled data for neural grammatical error detection. J. Comput. Sci. Technol. 2017, 32, 758–767.

2017d

(Tezcan et al., 2017) ⇒ Tezcan, A.; Hoste, V.; Macken, L. A neural network architecture for detecting grammatical errors in statistical machine translation. Prague Bull. Math. Linguist. 2017, 108, 133–145.

2016a

(Taghipour & Ng, 2016) ⇒ Taghipour, K.; Ng, H.T. A neural approach to automated essay scoring. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 1882–1891.

2016b

(Yang et al., 2016) ⇒ Yang, Z.; Yang, D.; Dyer, C.; He, X.; Smola, A.; Hovy, E. Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of The North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 1480–1489.

2016c

(Yuan & Briscoe, 2016) ⇒ Yuan, Z.; Briscoe, T. Grammatical error correction using neural machine translation. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 380–386.

2015

(Sun et al., 2015) ⇒ Sun, C.; Jin, X.; Lin, L.; Zhao, Y.; Wang, X. Convolutional neural networks for correcting English article errors. In: Natural Language Processing and Chinese Computing; Springer: Berlin/Heidelberg, Germany, 2015; pp. 102–110.

2014

(Kochmar & Briscoe, 2014) ⇒ Kochmar, E.; Briscoe, E. Detecting learner errors in the choice of content words using compositional distributional semantics. In: Proceedings of the Association for Computational Linguistics, Baltimore, MD, USA, 22–27 June 2014.

2010a

(Rozovskaya & Roth, 2010) ⇒ Rozovskaya, A.; Roth, D. Training paradigms for correcting errors in grammar and usage. In: Proceedings of the Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Los Angeles, CA, USA, 2–4 June 2010; pp. 154–162.

2010b

(Tetreault et al., 2010) ⇒ Tetreault, J.; Foster, J.; Chodorow, M. Using parse features for preposition selection and error detection. In: Proceedings of the ACL 2010 Conference Short Papers. Association for Computational Linguistics, Uppsala, Sweden, 11–16 July 2010; pp. 353–358.

2010c

(Xiong et al., 2010) ⇒ Xiong, D.; Zhang, M.; Li, H. Error detection for statistical machine translation using linguistic features. In: Proceedings of the 48th annual meeting of the Association for Computational Linguistics, Uppsala, Sweden, 11–16 July 2010; pp. 604–611.

2008

(Tetreault & Chodorow, 2008) ⇒ Tetreault, J.; Chodorow, M. The ups and downs of preposition error detection in ESL writing. In: Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Manchester, UK, 18–22 August 2008; pp. 865–872.

2007a

(Chodorow et al., 2007) ⇒ Chodorow, M.; Tetreault, J.; Han, N.R. Detection of grammatical errors involving prepositions. In: Proceedings of the Fourth ACL-SIGSEM Workshop on Prepositions, Prague, Czech Republic, 28 June 2007; pp. 25–30.

2007b

(Wagner et al., 2007) ⇒ Wagner, J.; Foster, J.; van Genabith, J. A comparative evaluation of deep and shallow approaches to the automatic detection of common grammatical errors. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, 28–30 June 2007; pp. 112–121.

2009

(Wagner et al., 2009) ⇒ Wagner, J.; Foster, J.; van Genabith, J. Judging grammaticality: Experiments in sentence classification. Calico J. 2009, 26, 474–490.

BibTeX

@article{2020_LexPosFeatureBasedGrammarErrorD,
  author    = {Nancy Agarwal and
               Mudasir Ahmad Wani and
               Patrick Bours},
  title     = {Lex-Pos Feature-Based Grammar Error Detection System for the English Language},
  journal   = {Electronics},
  volume    = {9},
  year      = {2020},
  number    = {10},
  article-number = {1686},
  url       = {https://www.mdpi.com/2079-9292/9/10/1686},
  doi       = {10.3390/electronics9101686},
  issn      = {2079-9292},
}
