Loss functions (Incorrect predictions penalty)

About

Loss functions define how to penalize incorrect predictions. The optimization problems associated with various linear classifiers are defined as minimizing the loss on the training points (sometimes along with a regularization term).

They can also be used to evaluate the quality of models.

Type

Regression

  • Squared loss = <math>(y-\hat{y})^2</math>
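As a minimal sketch, the squared loss can be written directly in Python (the name squaredLoss is illustrative, not part of this page):

def squaredLoss(y, y_hat):
    """Squared loss: the squared difference between the true value and the prediction."""
    return (y - y_hat) ** 2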

Classification

0-1

0-1 loss: the penalty is 0 for a correct prediction, and 1 otherwise.

As the 0-1 loss is not convex, the standard approach is to transform the categorical features into numerical features (see Statistics - Dummy (Coding|Variable) - One-hot-encoding (OHE)) and to use a regression loss.
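For illustration, the 0-1 loss itself is trivial to compute in Python (the name zeroOneLoss is illustrative):

def zeroOneLoss(y, y_hat):
    """0-1 loss: 0 if the predicted label matches the true label, 1 otherwise."""
    return 0 if y == y_hat else 1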

Log

Log loss is defined as: <MATH> \ell_{log}(p, y) = \begin{cases} -\log(p) & \text{if } y = 1 \\ -\log(1-p) & \text{if } y = 0 \end{cases} </MATH> where

  • <math>p</math> is a probability between 0 and 1. A baseline probability for a binary event is simply the mean of the training targets; it can then be compared to the output of a probabilistic model such as logistic regression.
  • <math>y</math> is a label of either 0 or 1.

Log loss is a standard evaluation criterion when predicting rare events, such as click-through rate prediction.

Python

from math import log

def computeLogLoss(p, y):
    """Calculates the value of log loss for a given probability and label.

    Args:
        p (float): A probability between 0 and 1.
        y (int): A label. Takes on the values 0 and 1.

    Returns:
        float: The log loss value.
    """
    # Clip p away from exactly 0 and 1 so that log() is always defined
    epsilon = 1e-12
    p = min(max(p, epsilon), 1 - epsilon)

    if y == 1:
        return -log(p)
    else:
        return -log(1 - p)
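As a usage sketch (the training labels below are hypothetical), the baseline probability described above is the mean of the training targets, and its log loss can be compared against a model's output:

trainingLabels = [0, 0, 1, 0, 1]                      # hypothetical labels
baseline = sum(trainingLabels) / len(trainingLabels)  # 0.4
print(computeLogLoss(baseline, 1))  # -log(0.4) ~ 0.916
print(computeLogLoss(baseline, 0))  # -log(0.6) ~ 0.511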




