Natural Language Generation (NLG) Task
A Natural Language Generation (NLG) Task is a language generation task and an automated natural language processing task that produces natural language expressions.
- Context:
- output: NLG Output (with NL expressions).
- measures: NLG Performance Measures, such as Syntactic Correctness and Intelligibility.
- It can range from (typically) being a Human-Performed Language Generation Task to being an Automated Language Generation Task (that can be solved by an NLG System that implements an NLG algorithm).
- It can range from being a Domain-Specific NLG Task to being an Open-Domain NLG Task.
- It can range from being a Speech Generation Task to being a Written Language Generation Task (such as text generation).
- It can range from being Word Generation, to being Phrase Generation, Sentence Generation, Passage Generation, Document Generation, ...
- It can range from being a Freeform NLG Task (such as chit chat) to being a Topic-based NLG Task.
- It can be preceded by or include a Natural Language Understanding Task.
- …
- Example(s):
- a Question Answering Generation Task.
- a Text Summarization Task, such as news summarization.
- a Text Generation Task.
- an Annotated Document Generation Task, such as WikiText Generation.
- a Data-to-Text Generation Task such as:
- an AMR-to-Text Generation Task.
- a Natural Language Translation Task.
- a Neural Natural Language Processing Task.
- …
- Counter-Example(s):
- See: Linguistic Component, Text Editing, Natural Language Processing Task.
References
2021
- (Wikipedia, 2021) ⇒ https://en.wikipedia.org/wiki/Natural-language_generation Retrieved:2021-2-20.
- Natural-language generation (NLG) is a software process that transforms structured data into natural language. It can be used to produce long form content for organizations to automate custom reports, as well as produce custom content for a web or mobile application. It can also be used to generate short blurbs of text in interactive conversations (a chatbot) which might even be read out by a text-to-speech system.
Automated NLG can be compared to the process humans use when they turn ideas into writing or speech. Psycholinguists prefer the term language production for this process, which can also be described in mathematical terms, or modeled in a computer for psychological research. NLG systems can also be compared to translators of artificial computer languages, such as decompilers or transpilers, which also produce human-readable code generated from an intermediate representation. Human languages tend to be considerably more complex and allow for much more ambiguity and variety of expression than programming languages, which makes NLG more challenging.
NLG may be viewed as the opposite of natural-language understanding (NLU): whereas in natural-language understanding, the system needs to disambiguate the input sentence to produce the machine representation language, in NLG the system needs to make decisions about how to put a concept into words. The practical considerations in building NLU vs. NLG systems are not symmetrical. NLU needs to deal with ambiguous or erroneous user input, whereas the ideas the system wants to express through NLG are generally known precisely. NLG needs to choose a specific, self-consistent textual representation from many potential representations, whereas NLU generally tries to produce a single, normalized representation of the idea expressed.[1]
NLG has existed since ELIZA was developed in the mid 1960s, but commercial NLG technology has only recently become widely available. NLG techniques range from simple template-based systems like a mail merge that generates form letters, to systems that have a complex understanding of human grammar. NLG can also be accomplished by training a statistical model using machine learning, typically on a large corpus of human-written texts.
- ↑ Dale, Robert; Reiter, Ehud (2000). Building natural language generation systems. Cambridge, U.K.: Cambridge University Press. ISBN 978-0-521-02451-8.
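The template-based end of the range described in the passage above (a mail merge that fills slots from structured data) can be illustrated with a minimal sketch. The record fields, the template wording, and the `realize` function name are all hypothetical and chosen only for illustration; real NLG systems typically add content selection, aggregation, and grammatical agreement on top of slot filling.

```python
from string import Template

# Hypothetical structured record; the field names are illustrative only.
record = {"city": "Aberdeen", "temp_c": 12, "condition": "light rain"}

# A fixed sentence template with named slots, in the spirit of a mail merge.
WEATHER_TEMPLATE = Template(
    "In $city it is $temp_c degrees Celsius with $condition."
)

def realize(data: dict) -> str:
    """Fill the template slots from a structured record."""
    # substitute() raises KeyError if a slot has no matching field.
    return WEATHER_TEMPLATE.substitute(data)

sentence = realize(record)
print(sentence)
```

Because the template is fixed, every output has the same sentence shape; this keeps the output predictable but limits variety of expression.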
2018a
- (Clark et al., 2018) ⇒ Elizabeth Clark, Yangfeng Ji, and Noah A. Smith. (2018). “Neural Text Generation in Stories Using Entity Representations As Context.” In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), Volume 1 (Long Papers). DOI:10.18653/v1/N18-1204.
2018b
- (Fedus et al., 2018) ⇒ William Fedus, Ian Goodfellow, and Andrew M Dai. (2018). "MaskGAN: Better Text Generation via Filling in the ________". In: Proceedings of the Sixth International Conference on Learning Representations (ICLR-2018).
2018c
- (Guo et al., 2018) ⇒ Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, and Jun Wang. (2018). “Long Text Generation via Adversarial Training with Leaked Information.” In: Proceedings of the Thirty-Second (AAAI) Conference on Artificial Intelligence (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th (AAAI) Symposium on Educational Advances in Artificial Intelligence (EAAI-18).
2018d
- (Kudo & Richardson, 2018) ⇒ Taku Kudo, and John Richardson. (2018). “SentencePiece: A Simple and Language Independent Subword Tokenizer and Detokenizer for Neural Text Processing.” In: arXiv preprint arXiv:1808.06226.
2018e
- (Lee et al., 2018) ⇒ Chris van der Lee, Emiel Krahmer, and Sander Wubben. (2018). “Automated Learning of Templates for Data-to-text Generation: Comparing Rule-based, Statistical and Neural Methods.” In: Proceedings of the 11th International Conference on Natural Language Generation (INLG 2018). DOI:10.18653/v1/W18-6504.
2018f
- (Song et al., 2018) ⇒ Linfeng Song, Yue Zhang, Zhiguo Wang, and Daniel Gildea. (2018). “A Graph-to-Sequence Model for AMR-to-Text Generation.” In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018) Volume 1: Long Papers. DOI:10.18653/v1/P18-1150
2018g
- (Zhu et al., 2018) ⇒ Yaoming Zhu, Sidi Lu, Lei Zheng, Jiaxian Guo, Weinan Zhang, Jun Wang, and Yong Yu. (2018). “Texygen: A Benchmarking Platform for Text Generation Models.” In: Proceedings of The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR 2018). DOI:10.1145/3209978.3210080.
2017a
- (Zhang et al., 2017) ⇒ Yizhe Zhang, Zhe Gan, Kai Fan, Zhi Chen, Ricardo Henao, Dinghan Shen, and Lawrence Carin. (2017). "Adversarial Feature Matching for Text Generation". In: Proceedings of the 34th International Conference on Machine Learning (ICML 2017).
2017b
- (Li et al., 2017) ⇒ Jiwei Li, Will Monroe, Tianlin Shi, Sebastien Jean, Alan Ritter, and Dan Jurafsky. (2017). “Adversarial Learning for Neural Dialogue Generation.” In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). DOI:10.18653/v1/D17-1230.
2017c
- (Lin, Li, et al., 2017) ⇒ Kevin Lin, Dianqi Li, Xiaodong He, Ming-ting Sun, and Zhengyou Zhang. (2017). “Adversarial Ranking for Language Generation.” In: Proceedings of Advances in Neural Information Processing Systems 30 (NIPS-2017).
2017d
- (Che et al., 2017) ⇒ Tong Che, Yanran Li, Ruixiang Zhang, R. Devon Hjelm, Wenjie Li, Yangqiu Song, and Yoshua Bengio. (2017). “Maximum-Likelihood Augmented Discrete Generative Adversarial Networks.” In: arXiv preprint arXiv:1702.07983.
2017e
- (Semeniuta et al., 2017) ⇒ Stanislau Semeniuta, Aliaksei Severyn, and Erhardt Barth. (2017). “A Hybrid Convolutional Variational Autoencoder for Text Generation.” In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). DOI:10.18653/v1/D17-1066.
2017f
- (Yu et al., 2017a) ⇒ Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. (2017). “SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient.” In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI 2017).
2017g
- (Bengio, 2017) ⇒ Yoshua Bengio. (2017). "Creating Human-Level AI" (Presentation). In: Asilomar Conference on Beneficial AI.
- QUOTE: What’s Missing (to achieve AGI) … Actually understanding language (also solves generating), requiring enough world knowledge / commonsense
2017h
- https://github.com/pytorch/examples/tree/master/word_language_model
- QUOTE: This example trains a multi-layer RNN (Elman, GRU, or LSTM) on a language modeling task. By default, the training script uses the WikiText-2 dataset, provided. The trained model can then be used by the generate script to generate new text.
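The train-then-generate workflow in the quote above can be sketched with a much simpler stand-in model. The sketch below is not the RNN from the PyTorch example: it uses a count-based bigram model, and the toy corpus, the `<s>` and `</s>` boundary markers, and the function names `train_bigram_model` and `generate` are assumptions made for illustration.

```python
import random
from collections import defaultdict

def train_bigram_model(corpus):
    """Map each word to the list of words observed to follow it."""
    successors = defaultdict(list)
    for sentence in corpus:
        # Boundary markers let the model learn how sentences start and end.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for current, following in zip(tokens, tokens[1:]):
            successors[current].append(following)
    return successors

def generate(successors, rng, max_words=20):
    """Sample one word at a time until the end marker or the length cap."""
    words = []
    current = "<s>"
    while len(words) < max_words:
        # Repeated entries in the list make frequent successors more likely.
        current = rng.choice(successors[current])
        if current == "</s>":
            break
        words.append(current)
    return " ".join(words)

# Hypothetical toy corpus; a real model would be trained on far more text.
toy_corpus = [
    "the system generates a report",
    "the system generates a summary",
    "the model writes a report",
]
model = train_bigram_model(toy_corpus)
text = generate(model, random.Random(0))
print(text)
```

A neural language model replaces the successor table with a learned distribution conditioned on a longer context, but the sampling loop has the same overall shape.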
2016
- (Kusner & Hernández-Lobato, 2016) ⇒ Matt J. Kusner, and Jose Miguel Hernández-Lobato. (2016). "GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution". In: arXiv preprint arXiv:1611.04051.
2015a
- (Bahdanau et al., 2015) ⇒ Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. (2015). “Neural Machine Translation by Jointly Learning to Align and Translate.” In: Proceedings of the Third International Conference on Learning Representations, (ICLR-2015).