Softmax

Figure: The Softmax activation function (Source: PapersWithCode.com)

The Softmax activation function, also known as the normalized exponential function, is a type of activation function that is primarily used in the output layer of a neural network for multi-class classification problems. It converts a vector of real numbers into a vector of probabilities, where each probability is proportional to the exponential of the corresponding input value.

The formula for the Softmax function is:

$$\text{Softmax}(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \quad \text{for } j = 1, \ldots, K$$

In this formula:

  • $z$ represents the input vector to the Softmax function,
  • $z_j$ refers to the j-th element of this input vector,
  • $K$ is the total number of classes in our multi-class classification problem.

The denominator, $\sum_{k=1}^{K} e^{z_k}$, acts as a normalization term to ensure that all values output by the Softmax function sum up to 1. This makes it possible to interpret these output values as probabilities.
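To make this concrete, here is a minimal NumPy sketch of the formula above. The function name `softmax` and the example scores are illustrative choices, and subtracting the maximum before exponentiating is a common numerical-stability trick rather than part of the formula itself:

```python
import numpy as np

def softmax(z):
    """Map a vector of K real-valued scores to a probability distribution."""
    # Subtract the maximum for numerical stability; the result is unchanged
    # because Softmax is invariant to adding a constant to every input.
    shifted = z - np.max(z)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])   # hypothetical logits for 3 classes
probs = softmax(scores)
print(probs)        # approx. [0.659 0.242 0.099]
print(probs.sum())  # 1.0 -- the outputs form a valid probability distribution
```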

The Softmax function is differentiable, which makes it suitable for gradient-based optimization methods such as backpropagation.
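For reference, the standard expression for these partial derivatives (not derived in the original text) is:

$$\frac{\partial\, \text{Softmax}(z)_i}{\partial z_j} = \text{Softmax}(z)_i \left( \delta_{ij} - \text{Softmax}(z)_j \right)$$

where $\delta_{ij}$ is the Kronecker delta (1 if $i = j$, 0 otherwise). Because every output depends on every input through the shared denominator, the gradient forms a full Jacobian matrix rather than a diagonal one.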