# Minimum Description Length Principle

A minimum description length principle is an algorithmic priciple that the best hypothesis for a data set is the one that leads to the largest data compression performance.

## References

### 2008

• http://www.modelselection.org/mdl/
• QUOTE: Minimum description length (MDL) (Rissanen 1978) is a technique from algorithmic information theory which dictates that the best hypothesis for a given set of data is the one that leads to the largest compression of the data. We seek to minimize the sum of the length, in bits, of an effective description of the model and the length, in bits, of an effective description of the data when encoded with the help of the model. Sewell (2006)
• QUOTE: "Minimum description length (MDL) (Rissanen 1978) uses an information theoretic measure. Kolmogorov complexity of a dataset is defined as the shortest description of the data. If the data is simple, it has a short complexity; for example, if it is a sequence of “0”s, we can just write “0” and the length of the sequence. If the data is completely random, then we cannot have any description of the data shorter than the data itself. If a model is appropriate for the data, then it has a good fit to the data, and instead of the data, we can send/store the model description. Out of all the models that describe the data, we want to have the simplest model so that it lends itself to the shortest description. So we again have a trade-off between how simple the model is and how well it explains the data." Alpaydin, 2004

### 1998

• (Grünwald, 1998) ⇒ Peter Grünwald. (1998). “The Minimum Description Length Principle and Reasoning under Uncertainty.
• ABSTRACT: To be able to forecast future events, science wants to infer general laws and principles from particular instances. This process of inductive inference is a central theme in statistics, pattern recognition and the branch of Artificial Intelligence called `machine learning'. The Minimum Description Length (MDL) Principle is a relatively recent method for inductive inference. It has its roots in information theory and theoretical computer science (Kolmogorov complexity) rather than statistics. The fundamental idea behind the MDL Principle is that any regularity in a given set of data can be used to compress the data, i.e. to describe it using fewer symbols than needed to describe the data literally. The more regularities there are in the data, the more we can compress it. This leads to the view (which is just a version of Occam's famous razor) that the more we can compress a given set of data, the more we can say we have learned about the data.
• QUOTE: The fundamental idea behind the MDL Principle is that any regularity in a given set of data can be used to compress the data, i.e. to describe it using fewer symbols than needed to describe the data literally.

### 1978

• (Rissanen, 1978) ⇒ Jorma Rissanen. (1978). “Modeling by Shortest Data Descriptions.” doi:10.1016/0005-1098(78)90005-5