8 Empirical Example: Neural Networks for Time Series Prediction

A Jupyter notebook comparison of the FNN, RNN, and LSTM architectures for time series prediction

The overall outcome may be summarized as follows: if no pre-aggregation is applied to the lagged input series, the FNN is outperformed by both the RNN and the LSTM. The differences, however, are marginal.
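The key practical difference between the three models lies in how the lagged inputs are presented: the FNN receives the lags as a flat feature vector, while the RNN and LSTM receive them as an ordered sequence. A minimal NumPy sketch (the series and lag count are illustrative assumptions, not the notebook's actual data) shows the two framings:

```python
import numpy as np

# Hypothetical monthly series: 60 years x 12 months = 720 observations.
rng = np.random.default_rng(0)
y = rng.standard_normal(720).cumsum()

n_lags = 12  # 12 lagged inputs, as in the FNN of this example

# FNN framing: each row holds the 12 most recent lags as a flat vector,
# so the network sees no ordering information beyond feature position.
X_fnn = np.stack([y[i : i + n_lags] for i in range(len(y) - n_lags)])
targets = y[n_lags:]

# RNN/LSTM framing: the same lags, kept as an ordered sequence of shape
# (samples, timesteps, features) so the recurrence can exploit order.
X_rnn = X_fnn[:, :, np.newaxis]

print(X_fnn.shape)    # (708, 12)
print(X_rnn.shape)    # (708, 12, 1)
print(targets.shape)  # (708,)
```

Both framings contain exactly the same numbers; only the recurrent models are able to use the temporal ordering of the lags, which is why their advantage over the FNN is small when the lag window already captures the relevant dynamics.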

Question for Reflection

Suppose an FNN with 12 lagged inputs, a plain RNN, and an LSTM are compared on a 60-year monthly macro forecasting dataset. Which of the three models is most exposed to the vanishing-gradient problem from Chapter 6, and which validation design from Chapter 2 is appropriate for comparing them?

The plain RNN is most exposed to vanishing gradients on long sequences: the FNN has no recurrence at all, and the LSTM introduces the gated cell-state path of Chapter 7 precisely to mitigate the problem. For the comparison itself, the appropriate design is time-ordered (rolling or expanding window) cross-validation from Chapter 2, not random K-fold, because the forecast-origin information set must be respected and random splits would leak future observations into training.
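The time-ordered validation design described above can be sketched as an expanding-window split generator. The fold sizes below (240 initial training months, 60-month test blocks) are illustrative assumptions, not prescriptions from the text; the essential property is that every test index lies strictly after every training index, so the forecast-origin information set is respected:

```python
import numpy as np

def expanding_window_splits(n_obs, initial_train, test_size):
    """Yield (train_idx, test_idx) pairs for expanding-window evaluation.

    Each fold trains on all observations up to the forecast origin and
    tests on the next `test_size` observations, so no future data leaks
    into training -- in contrast to random K-fold shuffling.
    """
    origin = initial_train
    while origin + test_size <= n_obs:
        yield np.arange(0, origin), np.arange(origin, origin + test_size)
        origin += test_size

# 720 monthly observations (60 years): start with 20 years of training,
# then evaluate on successive 5-year blocks.
folds = list(expanding_window_splits(720, initial_train=240, test_size=60))
print(len(folds))  # 8

for train_idx, test_idx in folds:
    # the information-set constraint: training always precedes testing
    assert train_idx.max() < test_idx.min()
```

A rolling-window variant would instead drop the oldest observations as the origin advances (a fixed-length `train_idx`); either design is admissible under Chapter 2's criterion, whereas random K-fold is not.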