Long Short-Term Memory (LSTM)
Go back to the [[AI Glossary]]
#seq
A type of cell in a recurrent neural network used to process sequences of data in applications such as handwriting recognition, machine translation, and image captioning. LSTMs address the vanishing gradient problem that occurs when training RNNs due to long data sequences by maintaining history in an internal memory state based on new input and context from previous cells in the RNN.
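For context (this example is not from the original note), here is a minimal sketch of running an LSTM over a batch of sequences in PyTorch; the layer sizes and the random input data are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 16 input features per time step, 32 hidden units.
lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)

# Fake batch: 4 sequences, each 10 time steps long, 16 features per step.
x = torch.randn(4, 10, 16)

# output holds the hidden state at every time step;
# (h_n, c_n) are the final hidden state and the internal cell (memory) state.
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([4, 10, 32])
print(h_n.shape)     # torch.Size([1, 4, 32])
print(c_n.shape)     # torch.Size([1, 4, 32])
```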
long short-term memory - LSTM
Go to [[Week 2 - Introduction]] or back to the [[Main AI Page]]. Part of the pages on [[Artificial Intelligence/Introduction to AI/Week 2 - Introduction/Natural Language Processing]] and [[Attention Mechanism]].
An LSTM is a deep learning architecture, in the same way that [[CNNS - Convolutional neural networks]] are.
These networks are able to selectively forget information, according to the RNN section of the beginners' guide to NLP.
According to Wikipedia:
A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.
Source - Wikipedia.
An LSTM is made up of (see the sketch after this list):
- a cell
- an input gate
- an output gate
- a forget gate
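A rough sketch of how these parts interact in a single time step: the forget gate decides how much of the previous cell state to keep, the input gate decides how much of the new candidate values to write, and the output gate decides what the cell exposes as its hidden state. The NumPy code, weight names, and shapes below are illustrative assumptions, not taken from any of the linked pages:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    x_t is the new input; h_prev and c_prev come from the previous cell.
    W, U, b are hypothetical dicts of weights/biases keyed by gate name:
    'i' (input), 'f' (forget), 'o' (output), 'g' (candidate cell update).
    """
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # input gate
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # forget gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # output gate
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # candidate values

    c_t = f * c_prev + i * g   # cell state: keep some history, write some new input
    h_t = o * np.tanh(c_t)     # output gate controls what the cell exposes
    return h_t, c_t
```

The additive update of the cell state (`c_t = f * c_prev + i * g`) is what lets gradients flow over long sequences, which is how the architecture mitigates the vanishing gradient problem mentioned above.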
The difference between LSTMs and [[GRUs]]: a GRU merges the forget and input gates into a single update gate and has no separate cell state, so it is simpler and has fewer parameters than an LSTM of the same size.
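As a quick, illustrative check of that difference (the layer sizes here are assumptions), PyTorch reports fewer parameters for a GRU than for an LSTM of the same width, since a GRU has three gated transformations instead of four:

```python
import torch.nn as nn

def n_params(module):
    # Total number of trainable parameters in a module.
    return sum(p.numel() for p in module.parameters())

lstm = nn.LSTM(input_size=16, hidden_size=32)
gru = nn.GRU(input_size=16, hidden_size=32)

print(n_params(lstm))  # 6400 = 4 * (16*32 + 32*32 + 2*32)
print(n_params(gru))   # 4800 = 3 * (16*32 + 32*32 + 2*32)
```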