The process of forecasting time series data is broken into a few steps:
Data Preparation: Get the data into the correct format for modelling.
Visualize the data: Plot the series to identify its main patterns before modelling.
Define a model/algorithm: Specifying an appropriate model for the data is essential for producing reliable forecasts.
Train the model (estimate): Once an appropriate model is selected, we next train the model on some data. This may result in one model per series (e.g. one model per country for GDP data).
Check model performance: Once a model has been trained, check how well it has performed on the data. There are several diagnostic tools available to check model behaviour, and also accuracy measures that allow one model to be compared against another.
Produce Forecast: The easiest way to produce forecasts is to specify the number of future observations to forecast, for example the next 10 observations or the next 2 years. If the model uses additional information from the data (e.g. exogenous variables), that information must be supplied for the future periods being forecast.
The result of a forecast includes the forecast distribution (often a normal distribution) and the point forecast (i.e. the mean of the forecast distribution).
The forecasts can be plotted alongside the historical data.
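The workflow above can be sketched end-to-end with the simple average (mean) method: train by estimating the mean, then produce point forecasts and a normal-distribution prediction interval. The data here are made up for illustration.

```python
import math

# Minimal sketch of the forecasting workflow, using the average (mean)
# method: every future forecast equals the historical mean, and the
# forecast distribution is assumed normal.

history = [112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0]

# Train (estimate): the mean method has one parameter, the sample mean.
mean = sum(history) / len(history)

# Residuals and their standard deviation give the spread of the
# (assumed normal) forecast distribution.
residuals = [y - mean for y in history]
sd = math.sqrt(sum(e * e for e in residuals) / (len(residuals) - 1))

# Produce forecasts for the next h observations: point forecast plus a
# 95% interval from the normal distribution (1.96 standard deviations).
h = 3
forecasts = [(mean, mean - 1.96 * sd, mean + 1.96 * sd) for _ in range(h)]
for step, (point, lo, hi) in enumerate(forecasts, start=1):
    print(f"h={step}: point={point:.1f}, 95% PI=({lo:.1f}, {hi:.1f})")
```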
Four simple forecasting methods serve as benchmarks: the average (mean) method, the naïve method, the seasonal naïve method, and the drift method. Sometimes one of these simple methods will be the best forecasting method available; but in many cases, these methods will serve as benchmarks rather than the method of choice. That is, any forecasting method we develop will be compared to these simple methods to ensure that the new method is better than these simple alternatives. If not, the new method is not worth considering.
Because a naïve forecast is optimal when data follow a random walk (see Section 9.1), these are also called random walk forecasts.
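The four benchmark methods can be sketched as plain functions over a list of observations `y` and a horizon `h`; the function names here are my own, not from any library.

```python
# Sketches of the four simple benchmark methods, assuming a plain list
# of observations y (oldest first) and a forecast horizon h.

def mean_forecast(y, h):
    # Average method: all forecasts equal the historical mean.
    return [sum(y) / len(y)] * h

def naive_forecast(y, h):
    # Naive method: all forecasts equal the last observation.
    return [y[-1]] * h

def seasonal_naive_forecast(y, h, m):
    # Seasonal naive: each forecast equals the last observed value
    # from the same season, where m is the seasonal period.
    return [y[len(y) - m + ((i - 1) % m)] for i in range(1, h + 1)]

def drift_forecast(y, h):
    # Drift method: last value plus the average historical change,
    # extrapolated i steps ahead.
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + slope * i for i in range(1, h + 1)]
```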
Fitted Values: Each observation in a time series can be forecast using all previous observations. These are called fitted values.
For example, if a time series has 10 observations, we can forecast the 8th observation using the first 7; that forecast is called the fitted value of the 8th observation. However, if the forecasting method uses any parameter estimated from all the observations (1st to 10th), then the fitted values are not true forecasts. For the average method, for instance, the fitted value for the 8th observation equals the average of all the observations (1st to 10th), so it is not a true forecast.
On the other hand, naïve or seasonal naïve forecasts involve no parameters, so in such cases the fitted values are true forecasts.
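The contrast between the two cases can be made concrete: with the average method the fitted value for the 8th observation uses the full-sample mean (including later observations), while the naive fitted value uses only the past. The series below is made up for illustration.

```python
# Why fitted values from the average method are not true forecasts:
# each fitted value uses the mean of ALL observations, including
# observations that come after it.

y = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0, 16.0, 18.0, 17.0, 19.0]

overall_mean = sum(y) / len(y)

# Average method: every fitted value is the full-sample mean, so the
# fitted value for the 8th observation depends on observations 9 and 10.
avg_fitted = [overall_mean for _ in y]

# Naive method: the fitted value for observation t is simply y[t-1].
# No parameters are estimated, so these fitted values are true forecasts.
naive_fitted = [None] + y[:-1]

print("fitted value of 8th obs (average method):", avg_fitted[7])
print("fitted value of 8th obs (naive method):", naive_fitted[7])
```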
Residuals: The "residuals" in a time series model are what is left over after fitting a model. For many (but not all) time series models, the residuals are equal to the difference between the observations and the corresponding fitted values: e_t = y_t - ŷ_t, where ŷ_t is the fitted value for observation t.
Residuals are useful in checking whether a model has adequately captured the information in the data. If patterns are observable in the residuals, the model can probably be improved.
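Two quick checks on the residuals are their mean (for bias) and their lag-1 autocorrelation (for leftover patterns). A minimal sketch, using naive fitted values and a made-up series:

```python
# Residual diagnostics, assuming residuals = observation - fitted value.
# A residual mean far from zero suggests bias; a large lag-1
# autocorrelation suggests the model is leaving information unused.

y = [3.0, 4.0, 3.5, 5.0, 4.5, 6.0, 5.5, 7.0]
fitted = [None] + y[:-1]          # naive fitted values

residuals = [obs - fit for obs, fit in zip(y[1:], fitted[1:])]

mean_resid = sum(residuals) / len(residuals)

# Lag-1 autocorrelation of the residuals.
centered = [e - mean_resid for e in residuals]
num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
den = sum(c * c for c in centered)
lag1_acf = num / den

print(f"residual mean: {mean_resid:.3f}, lag-1 acf: {lag1_acf:.3f}")
```

Here the residuals alternate in sign, producing a strongly negative lag-1 autocorrelation: a pattern that indicates the naive model could be improved for this series.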
A good forecasting method will result in residuals with the following properties:
1. The residuals are uncorrelated.
2. The residuals have zero mean (otherwise the forecasts are biased).
3. The residuals have constant variance.
4. The residuals are normally distributed.
Any forecasting method that does not satisfy properties 1 and 2 can be improved, i.e., it is not using all the available information. Fixing the bias problem (2) is easy: if the residuals have mean "m", then simply add "m" to all forecasts and the bias problem is solved. Fixing the correlation problem (1) is harder and will be discussed in Chapter 9.
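The bias fix is simple enough to show directly: estimate the residual mean m and add it to every forecast. The residuals and forecasts below are made up for illustration.

```python
# Sketch of the bias fix: if the residuals have mean m, adding m to
# every forecast removes the bias.

residuals = [0.8, 1.2, 0.9, 1.1, 1.0]        # systematically positive
forecasts = [20.0, 21.0, 22.0]

m = sum(residuals) / len(residuals)           # the estimated bias
adjusted = [f + m for f in forecasts]

print("bias m =", m)
print("adjusted forecasts:", adjusted)
```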
Properties 3 and 4 make the calculation of prediction intervals easier. However, a forecasting method that does not satisfy them cannot necessarily be improved. Sometimes applying a Box-Cox transformation may help. But otherwise there is usually little that you can do to ensure properties 3 and 4. Instead, an alternative approach to obtaining prediction intervals is necessary.