2007 ConditionalGraphicalModels


Subject Headings: Conditional Graphical Model.

Notes

Cited By

Quotes

Body

In this chapter we propose a modification of CRF-like algorithms that allows for solving large-scale structured classification problems. Our approach consists of upper-bounding the CRF functional in order to decompose its training into independent optimisation problems per clique. Furthermore, we show that each sub-problem corresponds to solving a multiclass learning task in each clique, which broadens the applicability of these tools to large-scale structural learning problems. Before presenting the Conditional Graphical Model (CGM), as we refer to this procedure, we review the family of CRF algorithms, concentrating on the best-known procedures and standard generalisations of CRFs. The objective of this introduction is to analyse, from a common viewpoint, the solutions proposed in the literature to tackle this problem, which allows their different features to be compared. We complete the chapter with a case study, in which we show that CGM can handle large-scale problems and that the performance obtained is comparable to that of CRF-like algorithms.

1.1 Introduction

Conditional Random Fields (CRFs) address this general problem as an extension of multiclass logistic regression. …
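The sense in which CRFs extend logistic regression can be made explicit. A standard log-linear CRF (following Lafferty et al., 2001; the feature maps $\phi_c$ and weight vector $\mathbf{w}$ are notation assumed in this note, not taken from the chapter) defines

\[ p(\mathbf{y} \mid \mathbf{x}, \mathbf{w}) = \frac{1}{Z(\mathbf{x}, \mathbf{w})} \exp\Big( \sum_{c \in \mathcal{C}(G)} \mathbf{w}^\top \phi_c(\mathbf{x}, \mathbf{y}_c) \Big), \qquad Z(\mathbf{x}, \mathbf{w}) = \sum_{\mathbf{y}'} \exp\Big( \sum_{c \in \mathcal{C}(G)} \mathbf{w}^\top \phi_c(\mathbf{x}, \mathbf{y}'_c) \Big), \]

where the sums run over the cliques $\mathcal{C}(G)$ of the graph $G$. When $G$ has a single vertex and no edges, the sum over cliques has one term and the model is exactly multiclass logistic regression (a softmax over the labels).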

Although the use of an undirected graphical model makes these procedures tractable, they are difficult to train, require custom-made optimisation tools, and cannot solve large-scale problems. In this chapter, we present (kernel) Conditional Graphical Models (CGMs), which simplify the training phase of CRF-like algorithms so that large-scale structured classification problems can be solved. This algorithmic proposal rests on the same principle used for solving the binary classification problem, in which the 0–1 loss is replaced by a convex upper bound so that the optimal solution can be computed easily. In our case, we replace the loss of the CRF-like algorithm with another convex loss function that decouples the training of each clique in the undirected graph. The CGM optimisation is then solved independently per clique using any general multiclass classification algorithm. The complexity of CGM depends on the selected multiclass learning tool, which opens up the possibility of solving large-scale problems, since inexpensive multiclass tools, such as SVMs, are available.
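As an illustration of this decoupling, the following is a minimal sketch in Python, not the authors' implementation: it assumes a chain-structured graph whose cliques are the edges $(y_l, y_{l+1})$, trains one independent multiclass problem per clique over joint label pairs using scikit-learn's LinearSVC as the off-the-shelf multiclass tool, and recovers a consistent labelling with a max-sum (Viterbi) pass. All data shapes and names are hypothetical.

    import numpy as np
    from sklearn.svm import LinearSVC

    # Toy chain-structured data: N sequences of length L, each position
    # carrying a D-dimensional feature vector and a label in {0, ..., K-1}.
    rng = np.random.default_rng(0)
    N, L, D, K = 200, 5, 10, 3
    X = rng.normal(size=(N, L, D))        # X[n, l] = features at position l
    Y = rng.integers(0, K, size=(N, L))   # Y[n, l] = label at position l

    # Decoupled training: one independent multiclass fit per clique.
    # On a chain the cliques are the edges (l, l+1); the "class" for
    # clique l is the joint label pair (Y[n, l], Y[n, l+1]): K*K classes.
    clique_models = []
    for l in range(L - 1):
        feats = np.concatenate([X[:, l], X[:, l + 1]], axis=1)
        joint = Y[:, l] * K + Y[:, l + 1]          # encode the label pair
        clique_models.append(LinearSVC(max_iter=10000).fit(feats, joint))

    def predict(x):
        """Label a sequence x of shape (L, D): score every joint label
        pair per clique, then run max-sum (Viterbi) along the chain."""
        scores = []
        for l, clf in enumerate(clique_models):
            row = np.concatenate([x[l], x[l + 1]])[None, :]
            s = clf.decision_function(row).ravel()
            full = np.full(K * K, -np.inf)         # pairs unseen in training
            full[clf.classes_] = s                 # map scores to label pairs
            scores.append(full.reshape(K, K))
        alpha = np.zeros(K)                        # forward max-sum pass
        back = []
        for s in scores:
            m = alpha[:, None] + s                 # (prev label, next label)
            back.append(m.argmax(axis=0))
            alpha = m.max(axis=0)
        labels = [int(alpha.argmax())]             # backtrack best labelling
        for bp in reversed(back):
            labels.append(int(bp[labels[-1]]))
        return labels[::-1]

    print(predict(X[0]))

The point of the sketch is that the expensive global training loop of a CRF is replaced by L-1 independent multiclass fits, each of which can use any scalable off-the-shelf classifier.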

1.2 A Unifying Review

We start from a probabilistic model that enables us to compute the posterior probability of every possible labelling: $p(\mathbf{y}^* \mid \mathbf{x}^*, \mathcal{D})$. We are interested in finding the labelling with the highest probability: $\max_{\mathbf{y}^*} \{ p(\mathbf{y}^* \mid \mathbf{x}^*, \mathcal{D}) \}$. From the Maximum a Posteriori (MAP) approach to learning models, we make the connection with regularised risk minimisation, and with solutions to the problem based on non-probabilistic discriminative methods, such as Support Vector Machines (SVMs).
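The MAP-to-regularised-risk connection can be sketched as follows; the Gaussian prior below is an assumption of this note, chosen because it yields the familiar $\ell_2$ regulariser. For a parametric model with weights $\mathbf{w}$ and i.i.d. training pairs,

\[ \hat{\mathbf{w}} = \arg\max_{\mathbf{w}} \; p(\mathbf{w} \mid \mathcal{D}) = \arg\max_{\mathbf{w}} \; p(\mathbf{w}) \prod_{n=1}^{N} p(\mathbf{y}_n \mid \mathbf{x}_n, \mathbf{w}). \]

Taking negative logarithms with $p(\mathbf{w}) \propto \exp(-\lambda \|\mathbf{w}\|^2)$ gives

\[ \hat{\mathbf{w}} = \arg\min_{\mathbf{w}} \; \sum_{n=1}^{N} -\log p(\mathbf{y}_n \mid \mathbf{x}_n, \mathbf{w}) + \lambda \|\mathbf{w}\|^2, \]

i.e. regularised empirical risk with the log-loss; replacing the log-loss by a convex margin-based upper bound is what leads to SVM-style, non-probabilistic solutions.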

Definition 1 (Conditional Random Field) Let $G = (V, E)$ be a graph such that $\mathbf{y}_n$ is indexed by the vertices of $G$; then $(\mathbf{x}_n, \mathbf{y}_n)$ is a conditional random field if, when conditioned on $\mathbf{x}_n$, the random variables $\mathbf{y}_n = [y_{n1}, y_{n2}, \ldots, y_{nL}]$ obey the Markov property with respect to the graph:

\[ p(y_{nl} \mid \mathbf{x}_n, \mathbf{y}_{n \setminus l}) = p(y_{nl} \mid \mathbf{x}_n, \mathbf{y}_{n\mathcal{N}(l)}), \]

where $\mathbf{y}_{n \setminus l}$ denotes all the components of $\mathbf{y}_n$ except $y_{nl}$, and $\mathcal{N}(l)$ is the set of neighbours of vertex $l$ in $G$.
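By the Hammersley–Clifford theorem, this Markov property is equivalent, for strictly positive distributions, to a factorisation of the conditional distribution over the cliques of $G$; the potential notation $\psi_c$ below is an assumption of this note, not taken from the chapter:

\[ p(\mathbf{y}_n \mid \mathbf{x}_n) = \frac{1}{Z(\mathbf{x}_n)} \prod_{c \in \mathcal{C}(G)} \psi_c(\mathbf{y}_{nc}, \mathbf{x}_n), \]

where $\mathcal{C}(G)$ is the set of cliques of $G$, $Z(\mathbf{x}_n)$ normalises over all labellings, and $\mathbf{y}_{nc}$ is the restriction of $\mathbf{y}_n$ to clique $c$. It is this clique-wise structure that CGM exploits to decouple training.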

References


Zoubin Ghahramani, Fernando Perez-Cruz, and Massimiliano Pontil. (2007). "Conditional Graphical Models." http://learning.eng.cam.ac.uk/zoubin/papers/CGM.pdf