A very superficial treatment of RNNs. You don’t go into any of the reasons RNNs have value, if any. You don’t mention any weight parameters, nor do you show them in your diagrams. The diagrams have been copied from another source, published by somebody else, that I have already seen. Have you actually tried implementing RNNs on your own? I mean… not using someone else’s software package or high-level library? I have, and it isn’t so obvious that LSTMs or GRUs do what they are advertised to do.