Activation functions are a very important component of neural networks in deep learning. They help determine the output of a deep learning model, its accuracy, and the computational efficiency of training it. They also have a major effect on whether a neural network converges and how quickly it does so. In some cases, the activation function can even prevent a neural network from converging. So, let’s look at activation functions, their types, and their importance and limitations in detail.
What is an activation function?
Activation functions help us determine the output of a neural network. An activation function is attached to each neuron in the network and determines whether that neuron should be activated or not, based on whether the neuron’s input is relevant for the model’s prediction.
Many activation functions also normalize the output of each neuron to a bounded range, typically between 0 and 1 or between -1 and 1.
Neural networks are sometimes trained on millions of data points, so the activation function must also be computationally efficient in order to keep computation time low and improve performance.
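
As a rough illustration of the normalization described above, the sketch below uses two common activation functions, sigmoid and tanh. They are assumed here purely as examples, since the text has not named specific functions at this point, and the helper name `sigmoid` is illustrative. Sigmoid maps its input into the range between 0 and 1, and tanh maps it into the range between -1 and 1.

```python
import math


def sigmoid(x: float) -> float:
    """Map any real number into the range between 0 and 1."""
    # Use the form that avoids overflow in math.exp for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Compare how each function squashes a few sample inputs.
for x in (-5.0, 0.0, 5.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  tanh={math.tanh(x):+.4f}")
```

Both functions only use a single exponential per call, which is one reason they are cheap enough to evaluate for every neuron.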
Let’s understand how it works
In a neural network, inputs are fed into the neurons of the input layer. Each input has a weight, and multiplying each input by its weight, summing the results, and adding a bias gives the output of the neuron, which is then passed to the next layer, where the process repeats. The output can be represented as:
Y = ∑ (weights * input) + bias
Note: Y can range from -infinity to +infinity. So, to bring the output into the range we want for a prediction, we have to pass this value through an activation function.
The activation function is a kind of mathematical “gate” between the input feeding the current neuron and its output going to the next layer. It can be as simple as a step function that turns the neuron output on or off, depending on a given rule or threshold. The final output can be represented as:
Y = Activation function(∑ (weights * input) + bias)
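
The computation described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names (`weighted_sum`, `step`, `neuron_output`), the default threshold of 0, and the sample weights, inputs, and bias are all assumptions made for the example. The bias is added once per neuron, after the weighted inputs are summed.

```python
from typing import Callable, Sequence


def weighted_sum(weights: Sequence[float], inputs: Sequence[float], bias: float) -> float:
    """Compute Y = sum(weight * input) + bias for a single neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def step(y: float, threshold: float = 0.0) -> float:
    """Step activation: 1.0 (on) if y reaches the threshold, otherwise 0.0 (off)."""
    return 1.0 if y >= threshold else 0.0


def neuron_output(
    weights: Sequence[float],
    inputs: Sequence[float],
    bias: float,
    activation: Callable[[float], float] = step,
) -> float:
    """Pass the weighted sum through the activation function (the "gate")."""
    return activation(weighted_sum(weights, inputs, bias))


# Hypothetical values chosen only for illustration.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 0.5]
bias = 0.25

print(weighted_sum(weights, inputs, bias))   # -0.25
print(neuron_output(weights, inputs, bias))  # 0.0 (below the threshold, neuron is off)
print(neuron_output(weights, inputs, 1.0))   # 1.0 (a larger bias switches the neuron on)
```

Passing the activation as a parameter keeps the weighted sum separate from the gate, so the step function could be swapped for another activation without touching the rest of the code.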