Generative Model Training Algorithm


A Generative Model Training Algorithm is a probabilistic learning algorithm that can be implemented by a generative model training system to produce a generative model (by directly estimating the joint probability distribution of the target class and predictor variables).
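
As a minimal sketch of what such an algorithm does, the Python below fits a toy generative classifier: it estimates the class prior p(y) and a one-dimensional Gaussian class-conditional density p(x | y) by maximum likelihood, so that their product is an estimate of the joint p(x, y). The Gaussian assumption, the function names (fit_generative, joint_density, predict), and the sample data are illustrative assumptions, not part of any particular training system.

```python
import math
from collections import defaultdict


def fit_generative(xs, ys):
    """Estimate p(y) and a 1-D Gaussian p(x | y) per class by maximum likelihood."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    n = len(xs)
    model = {}
    for label, values in groups.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        # Floor the variance so a class with identical points stays usable.
        model[label] = {"prior": len(values) / n, "mean": mean, "var": max(var, 1e-9)}
    return model


def joint_density(model, x, label):
    """Estimated joint density p(x, y) = p(y) * p(x | y)."""
    p = model[label]
    gauss = math.exp(-(x - p["mean"]) ** 2 / (2 * p["var"])) / math.sqrt(2 * math.pi * p["var"])
    return p["prior"] * gauss


def predict(model, x):
    """Pick the class with the highest joint density (equivalently, highest posterior)."""
    return max(model, key=lambda label: joint_density(model, x, label))


# Hypothetical toy data: two well-separated classes.
xs = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
ys = ["a", "a", "a", "b", "b", "b"]
model = fit_generative(xs, ys)
```

Because the fitted object is a model of the joint distribution, it can be used both to classify (via predict) and, in principle, to draw new (x, y) pairs by sampling a class from the prior and then a value from that class's Gaussian.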



  • Perplexity
    • There are several types of generative models used in machine learning and AI. Here are some of the most well-known examples:
      • Generative Adversarial Networks (GANs): GANs consist of two neural networks - a generator that creates synthetic data samples, and a discriminator that tries to distinguish between real and generated samples. They are trained in an adversarial manner, with the generator aiming to fool the discriminator, and the discriminator trying to correctly identify real vs. fake samples.[3]
      • Variational Autoencoders (VAEs): VAEs are a type of generative model that learns the underlying probability distribution of the training data in an unsupervised manner. They consist of an encoder network that maps input data to a latent space, and a decoder network that generates new samples from the latent space.[2]
      • Autoregressive Models: Autoregressive models, such as PixelRNN and PixelCNN, generate data sequentially, predicting one element (e.g., pixel or word) at a time based on the previously generated elements.[2]
      • Flow-Based Generative Models: These models, like Real NVP and Glow, learn an invertible transformation from a simple probability distribution (e.g., Gaussian) to a complex data distribution, allowing for efficient sampling and exact likelihood computation.[2]
      • Diffusion Models: Diffusion models, such as DDPM and Stable Diffusion, formulate the generation process as a sequence of denoising steps, where a neural network is trained to predict and remove the noise added to the data at each step, ultimately generating high-fidelity samples from pure noise.[2]
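
Of the families listed above, the autoregressive one is the simplest to sketch without a neural network. The Python below is a hedged illustration only: a character-level bigram model stands in for models such as PixelRNN and PixelCNN, training is done by counting transitions, and generation proceeds one element at a time conditioned on the previous element. The names fit_bigram and sample, the START and END markers, and the training strings are hypothetical.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"


def fit_bigram(sequences):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for seq in sequences:
        tokens = [START, *seq, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample(counts, rng, max_len=20):
    """Generate one sequence element by element from the fitted counts."""
    out, prev = [], START
    while len(out) < max_len:
        options = counts[prev]
        # Draw the next token in proportion to its observed frequency after prev.
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


counts = fit_bigram(["abab", "abab", "ab"])
generated = sample(counts, random.Random(0))
```

In this toy data "a" is always followed by "b", so every generated string alternates between the two characters; a neural autoregressive model replaces the count table with a learned conditional distribution over a much longer context.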




  • (Wick et al., 2009) ⇒ Michael Wick, Aron Culotta, Khashayar Rohanimanesh, and Andrew McCallum. (2009). “An Entity Based Model for Coreference Resolution.” In: Proceedings of the SIAM International Conference on Data Mining (SDM 2009).
    • Statistical approaches to coreference resolution can be broadly placed into two categories: generative models, which model the joint probability, and discriminative models that model the conditional probability. These models can be either supervised (uses labeled coreference data for learning) or unsupervised (no labeled data is used). Our model falls into the category of discriminative and supervised.


  • (Bouchard & Triggs, 2004) ⇒ Guillaume Bouchard, and Bill Triggs. (2004). “The Trade-off Between Generative and Discriminative Classifiers.” In: Proceedings of COMPSTAT 2004.
    • QUOTE: … In supervised classification, inputs [math]\displaystyle{ x }[/math] and their labels [math]\displaystyle{ y }[/math] arise from an unknown joint probability [math]\displaystyle{ p(x,y) }[/math]. If we can approximate [math]\displaystyle{ p(x,y) }[/math] using a parametric family of models [math]\displaystyle{ G = \{p_\theta(x,y), \theta \in \Theta\} }[/math], then a natural classifier is obtained by first estimating the class-conditional densities, then classifying each new data point to the class with highest posterior probability. This approach is called generative classification.

      However, if the overall goal is to find the classification rule with the smallest error rate, this depends only on the conditional density [math]\displaystyle{ p(y \vert x) }[/math]. Discriminative methods directly model the conditional distribution, without assuming anything about the input distribution p(x). Well known generative-discriminative pairs include Linear Discriminant Analysis (LDA) vs. Linear logistic regression and naive Bayes vs. Generalized Additive Models (GAM). Many authors have already studied these models e.g. [5,6]. Under the assumption that the underlying distributions are Gaussian with equal covariances, it is known that LDA requires less data than its discriminative counterpart, linear logistic regression [3]. More generally, it is known that generative classifiers have a smaller variance than their discriminative counterparts.

      Conversely, the generative approach converges to the best model for the joint distribution p(x,y) but the resulting conditional density is usually a biased classifier unless its [math]\displaystyle{ p_\theta(x) }[/math] part is an accurate model for p(x). In real world problems the assumed generative model is rarely exact, and asymptotically, a discriminative classifier should typically be preferred [9, 5]. The key argument is that the discriminative estimator converges to the conditional density that minimizes the negative log-likelihood classification loss against the true density p(x, y) [2]. For finite sample sizes, there is a bias-variance tradeoff and it is less obvious how to choose between generative and discriminative classifiers.
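
The LDA vs. linear logistic regression pairing mentioned in the quote can be checked numerically in a simple case. The sketch below assumes one feature, two classes, and Gaussian class-conditional densities with a shared variance; under those assumptions the posterior obtained from the generative model through Bayes' rule has exactly the sigmoid-of-a-linear-function form that logistic regression fits directly. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def gaussian(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def generative_posterior(x, prior1, mean0, mean1, var):
    """p(y=1 | x) computed from the joint p(x, y) = p(y) p(x | y) via Bayes' rule."""
    joint1 = prior1 * gaussian(x, mean1, var)
    joint0 = (1 - prior1) * gaussian(x, mean0, var)
    return joint1 / (joint0 + joint1)


def logistic_form(x, prior1, mean0, mean1, var):
    """The same posterior written as sigmoid(w * x + b)."""
    w = (mean1 - mean0) / var
    b = (mean0 ** 2 - mean1 ** 2) / (2 * var) + math.log(prior1 / (1 - prior1))
    return 1 / (1 + math.exp(-(w * x + b)))


# Assumed parameters for the illustration.
params = dict(prior1=0.3, mean0=0.0, mean1=2.0, var=1.5)
gaps = [
    abs(generative_posterior(x, **params) - logistic_form(x, **params))
    for x in (-1.0, 0.5, 2.0)
]
```

The two functions agree up to floating-point error, which is why the pair differs only in how the parameters are estimated: the generative route fits means, variance, and prior to model p(x, y), while the discriminative route fits w and b to model p(y | x) alone.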


  1. T. Mitchell. Generative and Discriminative Classifiers: Naive Bayes and Logistic Regression. Draft Version, 2005.
  2. A. Y. Ng and M. I. Jordan. On Discriminative vs. Generative Classifiers: A comparison of logistic regression and Naive Bayes. In: NIPS 14, 2002.