Activation Functions in Neural Networks

Kajal Pawar

2 years ago

Activation functions are a key component of neural networks in deep learning. They help determine the output of a deep learning model, its accuracy, and the computational efficiency of training it. They also have a major effect on whether a neural network converges and how quickly it does so. In some cases, a poorly chosen activation function can prevent a network from converging at all. So, let’s look at what activation functions are, the types of activation functions, and their importance and limitations in detail.

What is the activation function?

Activation functions help us determine the output of a neural network. An activation function is attached to each neuron in the network and determines whether that neuron should be activated, based on whether the neuron’s input is relevant to the model’s prediction.

Many activation functions also normalize the output of each neuron to a fixed range, such as between 0 and 1 or between -1 and 1.

Neural networks are sometimes trained on millions of data points, and the activation function is evaluated for every neuron on every one of them. It must therefore be cheap to compute, so that it keeps training time down rather than adding to it.
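To make the idea of normalizing a neuron’s output concrete, here is a minimal sketch of two common activation functions, sigmoid and tanh. The function names and the sample inputs are chosen for illustration only and do not come from any particular library. Sigmoid maps any real number into the range 0 to 1, and tanh maps it into the range -1 to 1.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    # Two algebraically equal branches, so math.exp never receives a
    # large positive argument and cannot overflow.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Squash any real number into the open interval (-1, 1)."""
    return math.tanh(x)


if __name__ == "__main__":
    # Illustrative inputs: a large negative value, zero, a large positive value.
    for x in (-5.0, 0.0, 5.0):
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  tanh={tanh(x):+.4f}")
```

However large or small the input, the sigmoid output stays between 0 and 1 and the tanh output stays between -1 and 1.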

How does it work?

In a neural network, inputs are fed into the neurons of the input layer. Each connection into a neuron has a weight. The neuron multiplies each input by its weight, sums the results, and adds a bias. This value is then passed on to the next layer, and the process continues. The output can be represented as:

            Y = ∑ (weights * input) + bias

Note: Y can take any value from -infinity to +infinity. So, to bring the output into the range we want for our prediction, we have to pass this value through an activation function.

The activation function is a kind of mathematical “gate” between the input feeding the current neuron and its output going to the next layer. It can be as simple as a step function that turns the neuron’s output on or off, depending on a rule or threshold. The final output can be represented as:

            Y = Activation function(∑ (weights * input) + bias)
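The computation described above can be sketched for a single neuron as follows. This is only an illustration: the helper names (`pre_activation`, `step`, `neuron_output`), the sample numbers, and the threshold of 0 are all assumptions made for this example, and the sketch assumes the bias is added once, after the weighted sum.

```python
from typing import Callable, Sequence


def pre_activation(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Weighted sum of the inputs plus a single bias term (can be any real number)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def step(z: float, threshold: float = 0.0) -> float:
    """Step activation: 1.0 (on) if z reaches the threshold, else 0.0 (off)."""
    return 1.0 if z >= threshold else 0.0


def neuron_output(
    inputs: Sequence[float],
    weights: Sequence[float],
    bias: float,
    activation: Callable[[float], float],
) -> float:
    """Y = activation(sum(weights * inputs) + bias)."""
    return activation(pre_activation(inputs, weights, bias))


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    inputs = [0.5, -1.0, 2.0]
    weights = [0.4, 0.3, -0.2]

    # With bias 0.1 the weighted sum is about -0.4, so the neuron stays off.
    print(neuron_output(inputs, weights, 0.1, step))
    # With bias 0.6 the weighted sum is about 0.1, so the neuron turns on.
    print(neuron_output(inputs, weights, 0.6, step))
```

Passing the activation in as a function argument means the same neuron code can be reused with a different activation, without changing how the weighted sum is computed.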

