Sigmoid

[Figure: plot of the sigmoid activation function. Source: medium.com]

The sigmoid function is an activation function traditionally used in neural networks and deep learning models. The main reason for using it is that its output lies in the range (0, 1), which makes it especially suitable for models that must predict a probability as their output.

The sigmoid function also has a convenient derivative that can be written in terms of the original function itself, which makes the calculations simpler when applying backpropagation in neural networks.

The formula of the sigmoid function is given as:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

where:

  • $\sigma(x)$ is the output of the sigmoid function,
  • $e$ is Euler's number (the base of natural logarithms),
  • $x$ is the input to the function.
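As a concrete illustration, here is a minimal NumPy sketch of the function (the helper name `sigmoid` and the use of NumPy are assumptions for illustration, not taken from the original article):

```python
import numpy as np

def sigmoid(x):
    """Sigmoid: 1 / (1 + e^(-x)), computed in a numerically stable way."""
    x = np.asarray(x, dtype=float)
    # e^(-|x|) is always <= 1, so neither branch can overflow for large |x|.
    z = np.exp(-np.abs(x))
    return np.where(x >= 0, 1.0 / (1.0 + z), z / (1.0 + z))

print(sigmoid(np.array([-2.0, 0.0, 2.0])))  # approx. [0.1192, 0.5, 0.8808]
```

The split on the sign of the input is a standard stability trick: the naive form `1 / (1 + exp(-x))` overflows for large negative inputs.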

The derivative of the sigmoid function can be expressed as:

$$\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$$

This property makes gradient computation very efficient during gradient descent when training a neural network: the derivative can be obtained directly from the activation value already computed in the forward pass.
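A short sketch of how this reuse looks in practice (again an illustrative example, with the naive sigmoid form, which is fine for the small inputs used here); the result is checked against a finite-difference estimate:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
s = sigmoid(x)           # forward pass: activation values
grad = s * (1.0 - s)     # backward pass: reuses s, no extra exp() call

# Sanity check against a central finite-difference derivative estimate.
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2.0 * eps)
print(np.allclose(grad, numeric))  # True
```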

However, one major disadvantage of the sigmoid activation function is that it can cause a neural network to get stuck during training if the initial weights are not set properly. This problem arises because $\sigma'(x)$ approaches zero for inputs of large magnitude, so the error signal propagated back through saturated units becomes vanishingly small; this is known as the "vanishing gradients" problem.
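The saturation is easy to see numerically. The short sketch below (illustrative; the printed values in the comments are approximate) shows how quickly the gradient shrinks as the input grows:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

xs = np.array([0.0, 2.0, 5.0, 10.0])
grads = sigmoid(xs) * (1.0 - sigmoid(xs))
for x, g in zip(xs, grads):
    print(f"x = {x:5.1f}  gradient = {g:.6f}")
# x =   0.0  gradient = 0.250000
# x =   2.0  gradient = 0.104994
# x =   5.0  gradient = 0.006648
# x =  10.0  gradient = 0.000045
```

The maximum possible gradient is 0.25 (at $x = 0$), and it decays toward zero on both sides, which is why stacked sigmoid layers pass back ever-smaller error signals.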