Seyeon An April 30, 2021

<aside> 🔗 We have reposted this blog on our Medium publication. Read this on Medium.

</aside>

Ranging from the fluctuating stock price to the complex traffic situation, most of the data we see daily are represented as time series. Such time series data have temporal patterns, and these patterns can be both clear and ambiguous. Time series forecasting is thus a common problem which is actively studied in machine learning and data mining.

Most time series data are multivariate. This refers to time series variables that have correlations to each other. We utilize the relationships between these variables to perform the jobs above, as predicting the traffic of a road, predicting the electricity consumption of a city, and predicting the price of a stock. In other words, we can hugely improve the accuracy of time series forecasting using the relationship in between different regions, different cities, and different stocks.

Thus, the core problem of the paper is to improve the multivariate time series forecasting model, which handles the problem below:

<aside> 💡 Given multivariate time series $X$—with the number of variables $d$ and the number of recent observations $w$—and prediction horizon $h$, predict the observation $y$ after $h$ time steps.

</aside>

Multivariate Time Series Forecasting

There currently exist a number of models for multivariate time series forecasting. Yet, existing models require too many parameters, since it considers the patterns in each variable and the relationships between different variables in a single step. Since there are many variables to predict with insufficient training data in the case of time series data, overfitting is very common, which lowers prediction model efficiency and makes hyperparameter tuning very hard. The trend of time series often changes over time, which forces complex models to fail at future predictions.

There have been a few past attempts for building such model. For example, the LSTNet (Lai et al., SIGIR 2018) model showed great accuracy using the RNN (recurrent neural networks) structure, the convolution kernels, and the temporal attention on time scale, but showed low efficiency because it required many parameters. Thus, the goal of this work is to build an accurate and efficient forecasting model, which improves both the accuracy and the efficiency of the existing forecasting models.

Attention-Based Autoregression (AttnAR)

This paper proposes an AttnAR (Attention-Based Autoregression) model which shows great performance using simple attention operations. The structure of the AttnAR model is illustrated in the figure below, and the main ideas of the model can be illustrated as below:

Separable Modules in AttnAR

Separable Modules in AttnAR

Mixed Convolution Extractor in AttnAR

Mixed Convolution Extractor in AttnAR