mT5 LLM


An mT5 LLM is a multilingual LLM that is based on the T5 LLM architecture and pre-trained on the mC4 corpus.



References

2024

  • (Hugging Face, 2024) ⇒ In: Hugging Face Documentation.
    • QUOTE: ... mT5 was only pre-trained on mC4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is usable on a downstream task, unlike the original T5 model. Since mT5 was pre-trained unsupervisedly, there’s no real advantage to using a task prefix during single-task fine-tuning. If you are doing multi-task fine-tuning, you should use a prefix.
    • Google has released the following variants: google/mt5-small, google/mt5-base, google/mt5-large, google/mt5-xl, and google/mt5-xxl.
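    • The note quoted above suggests a simple single-task fine-tuning workflow with no task prefix on the inputs. The following is a minimal sketch, assuming the Hugging Face transformers and torch libraries and a small illustrative summarization pair (the dataset, variant choice, and hyperparameters are assumptions, not part of the quoted documentation):

      # Minimal single-task fine-tuning sketch for mT5 (illustrative only).
      # Per the quoted note, the mC4-pre-trained checkpoint must be fine-tuned
      # before downstream use, and no task prefix is added to the inputs.
      import torch
      from transformers import MT5ForConditionalGeneration, MT5Tokenizer

      tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")
      model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

      # Hypothetical single-task data: note there is no "summarize:" prefix.
      sources = ["Der schnelle braune Fuchs springt über den faulen Hund."]
      targets = ["Ein Fuchs springt über einen Hund."]

      inputs = tokenizer(sources, return_tensors="pt", padding=True, truncation=True)
      labels = tokenizer(text_target=targets, return_tensors="pt",
                         padding=True, truncation=True).input_ids

      optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
      model.train()
      outputs = model(**inputs, labels=labels)  # seq2seq cross-entropy loss
      outputs.loss.backward()
      optimizer.step()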

2021