We store two types of time series data: actual data and predicted data.
For time series data, the fields can be divided into the following categories:
For predicted time series data, additional categories of fields are added:
Let's say we have the cargo flow data of 10 customers shipping 10 kinds of products from 10 origins to 10 destinations. Each row stores the quantity of one product shipped for one customer from one origin to one destination on a particular date. That gives around 10 × 10 × 10 × 10 × 30 = 300k rows for 30 days' worth of data (the actual number of rows may vary, because not every combination has a shipment on every date). On top of that, let's assume we make predictions at the beginning of every week for the next 3 months. So it will take
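The row-count arithmetic above can be checked with a short sketch. The dimension sizes and the 30-day window come from the example; the 90-day horizon (standing in for "3 months") and the assumption that each prediction run writes one row per combination per day are illustrative guesses, not figures from the text.

```python
from math import prod

# Dimension sizes taken from the example in the text.
dimensions = {"customers": 10, "origins": 10, "destinations": 10, "products": 10}
days_of_actuals = 30

# Number of distinct (customer, origin, destination, product) combinations.
combinations = prod(dimensions.values())

# Upper bound on actual rows: one row per combination per date.
actual_rows_upper_bound = combinations * days_of_actuals

# ASSUMPTION (not from the text): "3 months" is treated as 90 days, and a
# prediction run writes one row per combination per predicted date.
horizon_days = 90
rows_per_prediction_run = combinations * horizon_days

print(f"combinations:            {combinations:,}")
print(f"actual rows (30 days):   {actual_rows_upper_bound:,}")
print(f"rows per prediction run: {rows_per_prediction_run:,}")
```

Under these assumptions, a single weekly prediction run could write more rows than a full month of actual data, which is why predicted data tends to dominate the storage volume.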
Normally, for the time series we store, we have several years' worth of actual data to begin with. The volume of this data is enough to hurt both query performance and data ingestion speed. On top of that, we need to delete large amounts of data fairly frequently: