# Logistic (Log) Loss Function

A Logistic (Log) Loss Function is a convex loss function that is the negative logarithm of the probability that a probabilistic classifier assigns to the true label.

**AKA:** Log Loss, Logarithmic Loss.

**Context:**
- output: a Log Loss Value (that ranges from 0 to +∞, where 0 corresponds to a perfect prediction).
- It can (often) be used for a Binary Classification Task with Predicted Probability.

**Example(s):**

**Counter-Example(s):**
- a Hinge-Loss Function, as used by SVMs.
- a Cross-Entropy Measure.

**See:** Squared Error Function.

## References

### 2018a

- http://wiki.fast.ai/index.php/Log_Loss
- QUOTE: Logarithmic loss (related to cross-entropy) measures the performance of a classification model where the prediction input is a probability value between 0 and 1. The goal of our machine learning models is to minimize this value. A perfect model would have a log loss of 0. Log loss increases as the predicted probability diverges from the actual label. So predicting a probability of 0.012 when the actual observation label is 1 would be bad and result in a high log loss. There is a more detailed explanation of the justifications and math behind log loss here. …
… To calculate log loss from scratch, we need to include the MinMax function (see below). Numpy implements this for us with np.clip().

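```python
import numpy as np
from math import log

def logloss(true_label, predicted, eps=1e-15):
    # Clip the prediction away from 0 and 1 so that log() never receives 0.
    p = np.clip(predicted, eps, 1 - eps)
    if true_label == 1:
        return -log(p)
    else:
        return -log(1 - p)
```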
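
To make the quoted 0.012 example concrete, the following minimal sketch evaluates the per-sample loss at a few predicted probabilities for a true label of 1. The helper name `single_log_loss` and the probabilities other than 0.012 are illustrative assumptions, not part of the quoted source.

```python
import math

def single_log_loss(true_label, predicted, eps=1e-15):
    """Per-sample log loss (natural log); hypothetical helper for illustration."""
    # Clip so that log() never receives exactly 0.
    p = min(max(predicted, eps), 1 - eps)
    return -math.log(p) if true_label == 1 else -math.log(1 - p)

# Loss for a true label of 1 at increasingly wrong predicted probabilities.
losses = {p: single_log_loss(1, p) for p in (0.99, 0.5, 0.012)}
for p, loss in losses.items():
    print(f"predicted={p:<5} loss={loss:.4f}")
```

The loss at 0.012 is roughly 4.42, which shows that a single confident mistake already pushes the value well above 1.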

### 2018b

- http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html#theano.tensor.nnet.nnet.sigmoid_binary_crossentropy
- QUOTE: It is equivalent to binary_crossentropy(sigmoid(output), target), but with more efficient and numerically stable computation, especially when taking gradients.
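
The numerical-stability point in the quote can be illustrated with a small NumPy sketch. It is not Theano's implementation. It uses the commonly cited algebraic rewrite `max(x, 0) - x*z + log(1 + exp(-|x|))`, and the function names `naive_bce` and `stable_bce_from_logits` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def naive_bce(logits, target):
    """Cross-entropy computed on sigmoid(logits); can hit log(0) for large |logits|."""
    p = sigmoid(logits)
    return -(target * np.log(p) + (1 - target) * np.log(1 - p))

def stable_bce_from_logits(logits, target):
    """Same quantity rewritten as max(x, 0) - x*z + log(1 + exp(-|x|))."""
    x = np.asarray(logits, dtype=float)
    z = np.asarray(target, dtype=float)
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

# For moderate logits the two forms agree.
logits = np.array([-2.0, 0.0, 3.0])
target = np.array([0.0, 1.0, 1.0])
print(naive_bce(logits, target))
print(stable_bce_from_logits(logits, target))

# For an extreme logit the stable form stays finite, whereas the naive form
# would evaluate log(1 - 1.0).
print(stable_bce_from_logits(np.array([100.0]), np.array([0.0])))
```

Working directly on the pre-sigmoid output avoids ever forming a probability that rounds to exactly 0 or 1, which is why no clipping is needed in the stable form.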

### 2017a

- http://wiki.fast.ai/index.php/Log_Loss#Log_Loss_vs_Cross-Entropy
- QUOTE: Log loss and cross-entropy are slightly different depending on the context, but in machine learning when calculating error rates between 0 and 1 they resolve to the same thing. As a demonstration, where p and q are the sets p∈{y, 1−y} and q∈{ŷ, 1−ŷ} we can rewrite cross-entropy as:
- p = set of true labels
- q = set of prediction
- y = true label
- ŷ = predicted prob

- Which is exactly the same as log loss!
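
Assuming the standard definition of cross-entropy between two discrete distributions (an assumption, as the formula is not reproduced in this excerpt), the rewrite described in the quote can be sketched as:

[math]H(p, q) = -\sum_{i} p_i \log q_i = -y \log \hat{y} - (1 - y) \log(1 - \hat{y})[/math]

With p ∈ {y, 1−y} and q ∈ {ŷ, 1−ŷ}, the sum has exactly two terms, and the right-hand side is the per-sample log loss.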

### 2017b

```python
from math import log

def log_loss(predicted, target):
    """Mean binary log loss over paired predictions and 0/1 targets."""
    if len(predicted) != len(target):
        print('lengths not equal!')
        return
    target = [float(x) for x in target]  # make sure all float values
    predicted = [min([max([x, 1e-15]), 1 - 1e-15]) for x in predicted]  # within (0,1) interval
    return -(1.0 / len(target)) * sum([target[i] * log(predicted[i]) +
                                       (1.0 - target[i]) * log(1.0 - predicted[i])
                                       for i in range(len(target))])

if __name__ == '__main__':  # if you run at the command line as 'python utils.py'
    actual = [0, 1, 1, 1, 1, 0, 0, 1, 0, 1]
    pred = [0.24160452, 0.41107934, 0.37063768, 0.48732519, 0.88929869,
            0.60626423, 0.09678324, 0.38135864, 0.20463064, 0.21945892]
    print(log_loss(pred, actual))
```

### 2016

```python
import numpy as np

# NOTE: binarize_predictions and mvmean are helper functions from the quoted
# source that are not shown in this excerpt. The original sp.maximum and
# sp.minimum calls are written here with their NumPy equivalents.

def log_loss(solution, prediction, task='binary.classification'):
    """Log loss for binary and multiclass."""
    [sample_num, label_num] = solution.shape
    eps = 1e-15

    pred = np.copy(prediction)  # beware: changes in prediction occur through this
    sol = np.copy(solution)
    if (task == 'multiclass.classification') and (label_num > 1):
        # Make sure the lines add up to one for multi-class classification
        norma = np.sum(prediction, axis=1)
        for k in range(sample_num):
            pred[k, :] /= np.maximum(norma[k], eps)
        # Make sure there is a single label active per line for multi-class classification
        sol = binarize_predictions(solution, task='multiclass.classification')
        # For the base prediction, this solution is ridiculous in the multi-label case

    # Bounding of predictions to avoid log(0),1/0,...
    pred = np.minimum(1 - eps, np.maximum(eps, pred))
    # Compute the log loss
    pos_class_log_loss = - mvmean(sol * np.log(pred), axis=0)
    if (task != 'multiclass.classification') or (label_num == 1):
        # The multi-label case is a bunch of binary problems.
        # The second class is the negative class for each column.
        neg_class_log_loss = - mvmean((1 - sol) * np.log(1 - pred), axis=0)
        log_loss = pos_class_log_loss + neg_class_log_loss
        # Each column is an independent problem, so we average.
        # The probabilities in one line do not add up to one.
        # log_loss = mvmean(log_loss)
        # print('binary {}'.format(log_loss))
        # In the multilabel case, the right thing is to AVERAGE not sum
        # We return all the scores so we can normalize correctly later on
    else:
        # For the multiclass case the probabilities in one line add up to one.
        log_loss = pos_class_log_loss
        # We sum the contributions of the columns.
        log_loss = np.sum(log_loss)
        # print('multiclass {}'.format(log_loss))
    return log_loss
```

### 2015

- http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html
- Log loss, aka logistic loss or cross-entropy loss.
- QUOTE: This is the loss function used in (multinomial) logistic regression and extensions of it such as neural networks, defined as the negative log-likelihood of the true labels given a probabilistic classifier’s predictions. For a single sample with true label [math]y_t \in \{0,1\}[/math] and estimated probability [math]y_p[/math] that [math]y_t = 1[/math], the log loss is: [math]-\log P(y_t|y_p) = -(y_t \log(y_p) + (1 - y_t) \log(1 - y_p))[/math]

### 2014

- https://www.kaggle.com/wiki/LogarithmicLoss
- QUOTE: [math]\operatorname{log loss} = -\frac{1}{N}\sum_{i=1}^N\sum_{j=1}^My_{ij}\log(p_{ij})[/math]
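
The double sum in the quoted formula can be sketched in NumPy as follows, assuming one-hot encoded labels y and row-wise predicted probabilities p. The function name `multiclass_log_loss`, the clipping constant `eps`, and the sample data are illustrative assumptions rather than part of the quoted source.

```python
import numpy as np

def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean over N samples of -sum_j y_ij * log(p_ij).

    y_true: (N, M) one-hot label matrix; probs: (N, M) predicted probabilities.
    """
    y = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0); eps is an illustrative choice.
    p = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    return -np.sum(y * np.log(p)) / y.shape[0]

# Three samples, three classes; each row of probs sums to one.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])
loss = multiclass_log_loss(y_true, probs)
print(loss)
```

Because each row of `y_true` has a single 1, only the probability assigned to the true class of each sample contributes to the sum.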