https://papers.nips.cc/paper/9689-legendre-memory-units-continuous-time-representation-in-recurrent-neural-networks.pdf

tldr; LMUs are a new memory cell for RNNs whose memory performs roughly 100x better than an LSTM's memory cell. RNNs are notoriously bad at long-term dependencies, and LMUs find a way to dynamically retain information across long windows using comparatively fewer resources. Where an LSTM operates on a single time scale, the LMU uses Legendre polynomials to represent a sliding window of history across multiple time scales at once.
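The core of the LMU is a linear state-space system whose state projects the input's recent history onto the Legendre polynomial basis: the memory m(t) evolves as theta * dm/dt = A m + B u, with (A, B) given in closed form in the paper. A minimal sketch of constructing those matrices, assuming the paper's closed-form coefficients a_ij = (2i+1) * (-1 if i < j, else (-1)^(i-j+1)) and b_i = (2i+1)(-1)^i:

```python
import numpy as np

def lmu_matrices(d):
    """Continuous-time LMU state-space matrices for a memory of dimension d.

    The memory state m(t) obeys theta * dm/dt = A m + B u, where theta is
    the window length; m then approximates the input u over the sliding
    window [t - theta, t] in the Legendre polynomial basis.
    """
    A = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            # Coefficients from the paper's closed-form expression.
            A[i, j] = (2 * i + 1) * (-1.0 if i < j else (-1.0) ** (i - j + 1))
    q = np.arange(d)
    B = ((2 * q + 1) * (-1.0) ** q).reshape(d, 1)
    return A, B

A, B = lmu_matrices(4)
```

In practice these continuous-time matrices are discretized (e.g. via zero-order hold) and frozen, so the recurrent memory needs no trained recurrent weights; only the small nonlinear hidden layer around it is learned.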