Recurrent neural networks (RNNs) are nonautonomous dynamical systems driven by input, whose behaviour depends on both the model parameters and the inputs to the system. Describing the responses of a (trained) RNN to the full range of possible inputs requires going beyond the theory of autonomous dynamical systems to the more general setting of nonautonomous dynamical systems, in which the equations ruling the dynamics change over time. Unfortunately, the theory of nonautonomous dynamical systems is far less developed than that of autonomous systems: even basic notions such as convergence (and hence attractors) need to be carefully defined.
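As a minimal sketch of the distinction (our notation, not taken from the paper):

\[
x_{k+1} = F(x_k) \ \text{(autonomous)} \qquad \text{vs.} \qquad x_{k+1} = F(x_k, u_k) \ \text{(nonautonomous, input-driven)},
\]

where \(x_k\) denotes the network state and \(u_k\) the input at step \(k\). In the second case the effective update rule changes whenever the input does, which is why the autonomous theory no longer applies directly.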
Starting with the work of Herbert Jaeger, several authors have proposed that a successfully trained RNN should have the so-called “echo state” property (ESP). This idea gave rise to a training paradigm for RNNs called reservoir computing and a class of RNNs known as echo state networks (ESNs). An ESN is an RNN that is relatively easy to train, since optimisation is restricted to the output layer and the recurrent connections are left untouched after initialisation. If an RNN possesses the ESP, then for a given input sequence it asymptotically produces the same sequence of states regardless of its initial internal state: it “forgets” where it started and ends up following a unique (possibly complex) trajectory in response to that input. This trajectory represents the solution of the specific problem encoded in the input sequence, so the system acts as a filter that transforms the input sequence into a unique output sequence.
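A rough sketch of the reservoir-computing recipe may help fix ideas (a toy implementation under our own assumptions, not the paper's code; all names and sizes here are illustrative): the recurrent “reservoir” weights are drawn at random and then frozen, the input sequence is run through the reservoir, and only a linear readout is fitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 100, 1          # reservoir size and input dimension (illustrative)

# Fixed random weights: never trained after initialisation.
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # scale spectral radius below 1
W_in = rng.normal(size=(n_res, n_in))

def run_reservoir(u_seq, x0=None):
    """Drive the reservoir with an input sequence and collect its states."""
    x = np.zeros(n_res) if x0 is None else x0
    states = []
    for u in u_seq:
        x = np.tanh(W @ x + W_in @ u)
        states.append(x.copy())
    return np.array(states)

# Toy task: one-step-ahead prediction of a sine wave.
u_seq = np.sin(np.linspace(0, 20, 500)).reshape(-1, 1)
X = run_reservoir(u_seq[:-1])
y = u_seq[1:, 0]

# Training touches only the linear readout (ordinary least squares).
w_out, *_ = np.linalg.lstsq(X, y, rcond=None)
print("training MSE:", np.mean((X @ w_out - y) ** 2))
```

Scaling the spectral radius of the reservoir below one, as above, is a common heuristic aimed at securing the ESP: a contracting reservoir tends to forget its initial state and respond only to the input.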
The presence of the ESP has historically been associated with reliable RNN behaviour. Accordingly, the loss of the ESP has been directly associated with the loss of reliable behaviour, implying that correct computation is not possible without it. In a new paper, LML External Fellow Claire Postlethwaite and colleagues question this view with an analysis of the ESP grounded in the theory of nonautonomous dynamical systems. They show how the echo state property, which guarantees the existence of a unique (stable) response to an input sequence, may be generalised: a recurrent neural network might reliably produce several stable responses to the same input sequence, and they introduce an “echo index” that counts these stable responses. An echo index greater than one indicates the possibility of observing and exploiting multiple, yet consistent, behaviours of an RNN driven by a given input sequence. On the other hand, it might also indicate incorrect training, and hence signal possible malfunction on a task requiring a unique behaviour in phase space. The authors suggest that these results provide tools for the ambitious goal of developing mechanistic models of the behaviour of recurrent neural networks in machine learning tasks, such as time series classification and forecasting.
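A toy, one-dimensional illustration of the idea (our own example, not taken from the paper): the input-driven map x ← tanh(a·x + u) with a > 1 and weak input is bistable, so different initial states settle onto two distinct trajectories entrained to the same input sequence, i.e. two stable responses rather than one.

```python
import numpy as np

def drive(x0, u_seq, a=2.0):
    """Iterate the scalar input-driven map x <- tanh(a*x + u)."""
    x = x0
    for u in u_seq:
        x = np.tanh(a * x + u)
    return x

rng = np.random.default_rng(1)
u_seq = 0.05 * rng.standard_normal(2000)   # weak input: bistability survives

# Drive the same input from many initial conditions.
finals = np.array([drive(x0, u_seq) for x0 in np.linspace(-2, 2, 41)])

# Final states fall into two clusters: two stable responses to one input.
print(np.round(np.unique(np.round(finals, 3)), 3))
```

Here the final states cluster around two values (near ±1), so in the paper's terminology this system would have an echo index of two for such input sequences, whereas with a < 1 the map is contracting and the index collapses to one.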
The paper is available here.