Deep Neural Network-based Language Model (NLM) Training System

From GM-RKB

A Deep Neural Network-based Language Model (NLM) Training System is a Neural Network-based Language Modeling System that implements a Deep Neural Network-based Language Model Training Algorithm.



Context

2022

  • https://techcrunch.com/2022/12/30/theres-now-an-open-source-alternative-to-chatgpt-but-good-luck-running-it/ 
    • QUOTE: ... It’s an expensive process, collecting the training data. And training itself isn’t cheap. PaLM is 540 billion parameters in size, “parameters” referring to the parts of the language model learned from the training data. A 2020 study pegged the expenses for developing a text-generating model with only 1.5 billion parameters at as much as $1.6 million. And to train the open source model Bloom, which has 176 billion parameters, it took three months using 384 Nvidia A100 GPUs; a single A100 costs thousands of dollars. ...

      ... Running a trained model of PaLM + RLHF’s size isn’t trivial, either. Bloom requires a dedicated PC with around eight A100 GPUs. Cloud alternatives are pricey, with back-of-the-envelope math finding the cost of running OpenAI’s text-generating GPT-3 — which has around 175 billion parameters — on a single Amazon Web Services instance to be around $87,000 per year. ...
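The figures in the quote above can be sanity-checked with some rough arithmetic. The sketch below is illustrative only and rests on assumptions that are not in the source: "three months" is taken as 90 days, model weights are assumed to be stored at 2 bytes per parameter (fp16/bf16), and the A100 is assumed to be the 80 GB variant. The function names are hypothetical helpers, not part of any library.

```python
# Back-of-the-envelope arithmetic for the quoted training and serving figures.
# Assumptions (not stated in the source): 90-day training run,
# 2 bytes per parameter, 80 GB of memory per A100.

HOURS_PER_DAY = 24
HOURS_PER_YEAR = 365 * HOURS_PER_DAY


def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU-hours consumed by num_gpus running continuously for `days`."""
    return num_gpus * days * HOURS_PER_DAY


def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def hourly_cost(annual_cost: float) -> float:
    """Convert a yearly running cost into an average hourly cost."""
    return annual_cost / HOURS_PER_YEAR


# Bloom training: 384 A100 GPUs for roughly three months.
bloom_gpu_hours = gpu_hours(num_gpus=384, days=90)

# Bloom inference: 176 billion parameters against eight A100s.
bloom_weights_gb = weight_memory_gb(176e9)
eight_a100_gb = 8 * 80

# GPT-3 serving: about $87,000 per year on a single cloud instance.
gpt3_hourly = hourly_cost(87_000)

print(f"Bloom training: {bloom_gpu_hours:,.0f} GPU-hours")
print(f"Bloom weights: {bloom_weights_gb:.0f} GB vs {eight_a100_gb} GB available")
print(f"GPT-3 serving: ${gpt3_hourly:.2f} per hour")
```

Under these assumptions the weights alone take up more than half of the memory on eight A100s, which is consistent with the quoted hardware requirement. Activations and other runtime buffers need additional memory that this sketch does not estimate.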