Deep Neural Network-based Language Model (NLM) Training System

Context:
- It can be an expensive system.
- …
Counter-Example(s):
- a Large NLM Inference System.
See: Deep Neural Network Training System, Deep Neural Network-based Language Model.

Context

https://techcrunch.com/2022/12/30/theres-now-an-open-source-alternative-to-chatgpt-but-good-luck-running-it/
- QUOTE: ... It’s an expensive process, collecting the training data. And training itself isn’t cheap. PaLM is 540 billion parameters in size, “parameters” referring to the parts of the language model learned from the training data. A 2020 study pegged the expenses for developing a text-generating model with only 1.5 billion parameters at as much as $1.6 million. And to train the open source model Bloom, which has 176 billion parameters, it took three months using 384 Nvidia A100 GPUs; a single A100 costs thousands of dollars. ...
  ... Running a trained model of PaLM + RLHF’s size isn’t trivial, either. Bloom requires a dedicated PC with around eight A100 GPUs. Cloud alternatives are pricey, with back-of-the-envelope math finding the cost of running OpenAI’s text-generating GPT-3 — which has around 175 billion parameters — on a single Amazon Web Services instance to be around $87,000 per year. ...