A Deep Understanding of RNN, LSTM, and GRU with Python Programming

Nithya Rekha

2 years ago

The main objective of this post is to implement Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) from scratch and to explain them clearly from end to end. Building a neural network from scratch helps in understanding the concept clearly. In this post we will cover the following topics:
• Sequential Data
• Why Recurrent Neural Network (RNN)?
• Background work of RNN
Before getting into the concept of RNN, we should have an idea of what sequential data is, because an RNN is a specialized neural network that can handle sequential data.

Sequential Data

Any data whose elements must appear in a particular order to form meaningful information is sequence data.
Example: If we want to be free from corona, we should take proper measures like using a face mask, applying sanitizer, and maintaining social distance.
The sentence in this example carries a proper meaning because of the order of its words. If we jumble the words, the meaning is lost and a machine can no longer identify the pattern of the sentence.
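The point about jumbling can be checked with a small sketch. It uses only the Python standard library, and the shortened sentence and the choice of reversing it (any other reordering would do) are assumptions made for illustration:

```python
from collections import Counter

# A shortened version of the example sentence (an assumption for brevity).
sentence = "we should take proper measures like use face mask"
original = sentence.split()

# A jumbled version: the same words in reversed order.
jumbled = list(reversed(original))

# The word counts are identical, so counts alone cannot tell the two apart.
same_words = Counter(original) == Counter(jumbled)

# The sequences differ, and it is the order that carries the meaning.
same_order = original == jumbled

print(same_words)  # True
print(same_order)  # False
```

Any representation that only looks at which words occur would treat both lists as the same input, which is why the order has to be kept explicitly.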

Why RNN?

A Recurrent Neural Network is a specialized network that deals with sequence data such as time series, audio, and text. We can solve text problems with NLP techniques like TF-IDF, bag of words, and word2vec, but if we rely only on these techniques, we lose the sequential form of the data. This is where the RNN comes in: a Recurrent Neural Network has the ability to maintain the sequence of the data. The 'R' in RNN stands for Recurrent, which means it performs the same task for every element in the sequence. The output of one element is passed as input to the next element in the sequence. In other words, an RNN has a memory that stores what has been calculated so far.
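The idea of performing the same task for every element while carrying a memory forward can be sketched in a few lines. This is a minimal illustration, not the full implementation this post builds toward: the input and hidden state are single numbers, the `tanh` activation is a common choice rather than something fixed by the text, and the names `rnn_step` and `run_rnn` and the weight values are hypothetical.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrent step: combine the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same step (same weights) to every element of the sequence."""
    h = 0.0  # initial memory: nothing has been seen yet
    states = []
    for x_t in inputs:
        # The state produced for one element is fed into the next step.
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

forward = run_rnn([1.0, 2.0, 3.0])
backward = run_rnn([3.0, 2.0, 1.0])

# Same elements, different order: the final states are not the same.
print(forward[-1], backward[-1])
```

Because each state depends on the previous one, feeding the same numbers in a different order gives a different final state, which is the memory described above.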

Background work of RNN

Let's assume we have a data set with a set of reviews and a target variable.
D = {xi, yi}
where i = 1 to n
xi = the i-th review
yi = the label of that review.
This is a binary classification problem: the target variable takes only two values, positive or negative, which can be encoded as 0 and 1.
Example: x1 = The quality of one plus 7 pro is good.
This sentence x1 is positive, so y1 = positive.
Assume
The = x11 (1st review, 1st word)
quality = x12 (1st review, 2nd word)
of = x13 (1st review, 3rd word)
one = x14 (1st review, 4th word)
plus = x15 (1st review, 5th word)
7 = x16 (1st review, 6th word)
pro = x17 (1st review, 7th word)
is = x18 (1st review, 8th word)
good = x19 (1st review, 9th word)
x1 = x11 x12 x13 x14 x15 x16 x17 x18 x19
In a similar way:
x2 = x21 x22 x23 x24 x25 x26 ...
x3 = x31 x32 x33 x34 x35 ...
.
.
.
xn = xn1 xn2 xn3 xn4 ...
Now that we have a sequence of words for each review, we have to predict the result (positive or negative).