Multinomial Logistic Regression Algorithm
A Multinomial Logistic Regression Algorithm is a Logistic Regression Algorithm that is also a Multinomial Regression Algorithm (it handles a categorical dependent variable with more than two possible outcomes).
- AKA: Softmax Regression, Multinomial Logit Model, Maximum Entropy (MaxEnt) Classifier.
- See: Softmax Activation Function, Logistic Regression, Categorical Distribution, Dependent Variable, Independent Variable, Multinomial Distribution, Categorical Data, Machine Learning, Classifier (Machine Learning), Classification Rule, Naive Bayes Classifier, Statistical Independence, Maximum A Posteriori.
References
2013
- (Wikipedia, 2013) ⇒ http://en.wikipedia.org/wiki/Multinomial_logistic_regression Retrieved:2013-11-30.
In statistics, a multinomial logistic regression model, also known as softmax regression or multinomial logit, is a regression model that generalizes logistic regression by allowing more than two discrete outcomes.[1] That is, it is a model used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.). The use of the term "multinomial" in the name arises from the common conflation between the categorical and multinomial distributions, as explained in the latter article. However, it should be kept in mind that the actual goal of the multinomial logistic model is to predict categorical data.
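The model described in the quoted passage can be written compactly with the softmax function. A minimal sketch in conventional notation, assuming $K$ outcome classes, a feature vector $x$, and one weight vector $\beta_k$ per class (these symbols are illustrative, not taken from the cited text):

```latex
P(Y = k \mid x) \;=\; \frac{\exp(\beta_k^{\top} x)}{\sum_{j=1}^{K} \exp(\beta_j^{\top} x)},
\qquad k = 1, \dots, K.
```

For $K = 2$ this reduces to ordinary binary logistic regression, since the ratio of the two class probabilities depends only on the difference of the two weight vectors.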
In some fields of machine learning (e.g., natural language processing), when a classifier is implemented using a multinomial logit model as the classification rule, it is commonly known as a maximum entropy classifier, conditional maximum entropy model, or MaxEnt model for short. Maximum entropy classifiers are commonly used as alternatives to naive Bayes classifiers because they do not assume statistical independence of the random variables (commonly known as features) that serve as predictors. However, learning in such a model is slower than for a naive Bayes classifier, and thus may not be appropriate when a very large number of classes must be learned. In particular, learning in a naive Bayes classifier is a simple matter of counting up the number of co-occurrences of features and classes, while in a maximum entropy classifier the weights, which are typically estimated using maximum a posteriori (MAP) estimation, must be learned using an iterative procedure; see the sketch below.
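To make the contrast concrete, here is a minimal sketch (in Python with NumPy; all function and variable names are illustrative, not from the cited text) of the iterative MAP weight estimation described above. It runs batch gradient ascent on the log-posterior of a softmax model with a zero-mean Gaussian prior on the weights, which is equivalent to L2-regularized maximum likelihood:

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating, for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_maxent(X, y, num_classes, lr=0.1, l2=1.0, iters=500):
    """Iterative MAP estimation for a softmax (maximum entropy) classifier.

    X: (n, d) feature matrix; y: (n,) integer labels in [0, num_classes).
    l2 is the precision of the Gaussian prior, so maximizing the posterior
    equals maximizing the L2-regularized log-likelihood.
    """
    n, d = X.shape
    W = np.zeros((d, num_classes))
    Y = np.eye(num_classes)[y]                   # one-hot targets, (n, num_classes)
    for _ in range(iters):
        P = softmax(X @ W)                       # predicted class probabilities
        grad = X.T @ (Y - P) / n - l2 * W / n    # gradient of the log-posterior
        W += lr * grad                           # gradient ascent step
    return W

# Toy usage: three classes, two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + 2 * X[:, 1] > 0).astype(int) + (X[:, 0] > 1).astype(int)
W = fit_maxent(X, y, num_classes=3)
pred = softmax(X @ W).argmax(axis=1)
print("training accuracy:", (pred == y).mean())
```

In practice, second-order methods such as L-BFGS converge faster than plain gradient ascent, but the loop above illustrates why training is iterative, unlike the single counting pass needed for a naive Bayes classifier.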
- ↑ Greene, William H., Econometric Analysis, fifth edition, Prentice Hall, 1993: 720-723.