2022 LargeLanguageModelsCanSelfImpro

From GM-RKB

(Huang et al., 2022) ⇒ Jiaxin Huang, Shixiang Shane Gu, Le Hou, Yuexin Wu, Xuezhi Wang, Hongkun Yu, and Jiawei Han. (2022). “Large Language Models Can Self-improve.” In: arXiv preprint arXiv:2210.11610.

Subject Headings:

Notes

Cited By

http://scholar.google.com/scholar?q=%222022%22+Large+Language+Models+Can+Self-improve

Quotes

Abstract

Large Language Models (LLMs) have achieved excellent performances in various tasks. However, fine-tuning an LLM requires extensive supervision. Human, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate "high-confidence" rationale-augmented answers for unlabeled questions using Chain-of-Thought prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4%->82.1% on GSM8K, 78.2%->83.0% on DROP, 90.0%->94.4% on OpenBookQA, and 63.4%->67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground truth label. We conduct ablation studies and show that fine-tuning on reasoning is critical for self-improvement.
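
The selection step described in the abstract (sampling several Chain-of-Thought answers per unlabeled question, then keeping only "high-confidence" ones via self-consistency) can be illustrated with a minimal sketch. This is not the authors' code: the names `ReasoningPath` and `select_training_targets`, the agreement-fraction confidence measure, and the `min_confidence` threshold of 0.6 are assumptions made for illustration. The sampled reasoning paths are supplied as plain data, so no model is called.

```python
from collections import Counter
from typing import NamedTuple


class ReasoningPath(NamedTuple):
    rationale: str  # chain-of-thought text sampled from the model
    answer: str     # final answer parsed from that rationale


def select_training_targets(paths, min_confidence=0.6):
    """Return (majority_answer, confidence, kept_paths) for one question.

    Confidence is the fraction of sampled paths that agree with the most
    common answer. Paths are kept only when that fraction reaches the
    hypothetical threshold `min_confidence`.
    """
    if not paths:
        return None, 0.0, []
    counts = Counter(p.answer for p in paths)
    majority_answer, votes = counts.most_common(1)[0]
    confidence = votes / len(paths)
    if confidence < min_confidence:
        # Too little agreement: contribute no fine-tuning examples.
        return majority_answer, confidence, []
    # Keep every rationale that reaches the majority answer.
    kept = [p for p in paths if p.answer == majority_answer]
    return majority_answer, confidence, kept


# Hypothetical sampled outputs for a single unlabeled question.
sampled = [
    ReasoningPath("3 boxes of 4 is 12, plus 2 is 14.", "14"),
    ReasoningPath("4 * 3 = 12 and 12 + 2 = 14.", "14"),
    ReasoningPath("3 + 4 + 2 = 9.", "9"),
    ReasoningPath("Twelve from the boxes, two loose, so 14.", "14"),
]
answer, confidence, kept = select_training_targets(sampled)
print(answer, confidence, len(kept))
```

In this sketch the kept rationale-answer pairs would serve as the self-generated target outputs for fine-tuning; no ground-truth label enters the selection at any point.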

References

	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2022 LargeLanguageModelsCanSelfImpro	Xuezhi Wang Yuexin Wu Jiaxin Huang Shixiang Shane Gu Le Hou Hongkun Yu Jiawei Han			Large Language Models Can Self-improve						2022
