Encoder-Only Transformer Model

An [[Encoder-Only Transformer-based Model]] is a [[transformer-based model]] that consists solely of an [[encoder architecture]].
* <B>Context:</B>
** It can (typically) be responsible for [[encoding]] [[input sequence]]s into [[continuous representation]]s.
** It can (typically) process [[input token]]s through [[self-attention layer]]s to capture [[contextual relationship]]s.
** It can (typically) learn [[bidirectional context]] through [[masked language modeling]].
** It can (typically) generate [[contextual embedding]]s for [[downstream task]]s (see the sketch after this definition).
** It can (often) perform [[transfer learning]] via [[fine-tuning]].
** It can (often) handle [[multi-task learning]] through [[task-specific head]]s.
** ...
** It can range from being a [[Base Model]] to being a [[Large Model]], depending on its [[parameter count]].
** It can range from being a [[Task-Specific Model]] to being a [[General-Purpose Model]], depending on its [[training objective]]s.
** ...
* <B>Example(s):</B>
** an [[Encoder-Only Transformer-Based Language Model]], such as:
*** [[BERT Family]] models, such as:
**** [[BERT Model]] for [[general language understanding]].
**** [[RoBERTa Model]] for [[optimized training]].
**** [[ALBERT Model]] for [[parameter-efficient learning]].
*** [[XLM Family]] models, such as:
**** [[XLM Model]] for [[multilingual understanding]].
** ...
* <B>Counter-Example(s):</B>
** a [[Decoder-Only Transformer Model]], which focuses on [[sequence generation]].
** an [[Encoder-Decoder Transformer Model]], which uses both [[encoder]] and [[decoder]] components.
** a [[Recurrent Neural Network]], which uses [[sequential processing]] instead of [[parallel attention]].
* <B>See:</B> [[Encoder Architecture]], [[Self-Attention]], [[Bidirectional Model]], [[Encoder-Decoder Transformer Model]].
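
The following is a minimal, hypothetical sketch of how an [[Encoder-Only Transformer Model]] produces [[contextual embedding]]s, assuming the Hugging Face <code>transformers</code> library and the <code>bert-base-uncased</code> checkpoint (neither of which is prescribed by this page):

<syntaxhighlight lang="python">
import torch
from transformers import AutoTokenizer, AutoModel

# Load an encoder-only checkpoint (BERT base, uncased) -- an assumed example.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Encode an input sequence; in the encoder's self-attention layers,
# every token attends to every other token (bidirectional context).
inputs = tokenizer("Encoder-only models read the whole sequence at once.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per input token: shape (1, seq_len, 768).
token_embeddings = outputs.last_hidden_state
# A common pooling choice for downstream tasks: the [CLS] token's vector.
sentence_embedding = token_embeddings[:, 0, :]
</syntaxhighlight>

The <code>token_embeddings</code> tensor is what [[downstream task]] heads (e.g., a [[text classification]] layer) are typically attached to during [[fine-tuning]].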
 
----
 
== References ==
 
=== 2023 ===
* chat
** An [[Encoder-Only Transformer Model]] consists solely of an [[encoder architecture]]. This model is responsible for encoding [[input sequence]]s into [[continuous representation]]s, which can be used for different [[NLP task]]s, including [[text classification]], [[sentiment analysis]], and [[named entity recognition]]. A well-known example of an Encoder-Only Transformer Model is the [[BERT (Bidirectional Encoder Representations from Transformers) model]], developed by [[Google AI]].
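
As a minimal sketch of the [[masked language modeling]] objective used by models like [[BERT Model|BERT]] (assuming the Hugging Face <code>transformers</code> <code>fill-mask</code> pipeline, which this page does not itself reference):

<syntaxhighlight lang="python">
from transformers import pipeline

# BERT predicts the masked token from both its left and right context,
# which is what makes the encoder bidirectional.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for candidate in fill_mask("The encoder reads the [MASK] sentence at once."):
    print(candidate["token_str"], round(candidate["score"], 3))
</syntaxhighlight>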
 
----
__NOTOC__
[[Category:Concept]]
[[Category:Machine Learning]]
[[Category:Neural Architecture]]
[[Category:Quality Silver]]
