Some intuition on time series analysis

Time series are table columns that can be plotted against time. When we analyse them, we often want to forecast their values. How do we do it? Read on.

There are basically three strategies to forecast a time series: i) use your judgement, ii) predict using other variables (e.g., predict the price of cars with data about the price of steel), and iii) use the past values of the time series itself. The first one is too subjective, and the second one tends to be too complex (it relies on econometric techniques that are prone to bias). So most of the time we go for the third option, which is known in forecasting parlance as extrapolation.

You can extrapolate by fitting a trend to the past data, weighting each value equally, and projecting it into the future. Or by using autoregression, which means that you use the raw past values to make the prediction. Or you can apply moving averages, which is like autoregression, only now you smooth the past values to reduce the noise in them. But oftentimes the most convenient method is exponential smoothing, which combines the strengths of the other methods.
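To make the first and third ideas a bit more concrete, here is a minimal sketch in plain Python. The function names and the toy sales numbers are made up for illustration, and the moving average here is just the simple rolling mean described above, not the "MA" term you may have seen in ARIMA models.

```python
def linear_trend_forecast(values, steps):
    """Fit a least-squares line to equally weighted past values and extend it."""
    n = len(values)
    mean_x = (n - 1) / 2          # mean of the time index 0, 1, ..., n-1
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values)) / sum(
        (x - mean_x) ** 2 for x in range(n)
    )
    intercept = mean_y - slope * mean_x
    # Evaluate the fitted line at the next `steps` time indices.
    return [intercept + slope * (n + h) for h in range(steps)]


def moving_average_forecast(values, window):
    """Forecast the next value as the mean of the last `window` values."""
    recent = values[-window:]
    return sum(recent) / len(recent)


sales = [10, 12, 14, 16, 18]  # hypothetical weekly sales
print(linear_trend_forecast(sales, steps=2))
print(moving_average_forecast(sales, window=3))
```

Notice that on a steadily rising series the moving average lags behind, because it only averages values that are already in the past.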

If you do exponential smoothing, you use all your past data, smooth out the noise, and give more weight to the most recent values (the weights shrink exponentially as the values get older, hence the name). This way, if you're trying to predict sales for next week, last week's sales will count for more than the sales made two months ago.
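Here is roughly what that looks like in code. This is a sketch of simple exponential smoothing, assuming a smoothing parameter called alpha (between 0 and 1) and assuming we start the smoothed level at the first observation; both the names and the starting choice are mine, and other initialisations are common.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the one-step-ahead forecast after smoothing all past values."""
    level = values[0]  # assumption: initialise the level with the first value
    for y in values[1:]:
        # New level = part of the newest value + part of the old level.
        level = alpha * y + (1 - alpha) * level
    return level


def implied_weights(n, alpha):
    """Weight given to the value k steps back, for k = 0 (newest) to n-1.

    Holds for every value except the very first one in the series,
    whose weight depends on how the level was initialised.
    """
    return [alpha * (1 - alpha) ** k for k in range(n)]


print(simple_exponential_smoothing([10, 20, 30], alpha=0.5))  # 22.5
print(implied_weights(3, alpha=0.5))  # [0.5, 0.25, 0.125]
```

A larger alpha makes the forecast react faster to recent values, while a smaller alpha smooths more heavily.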

Now for some nice plots. This one's a series with trend, because it tends to move in a certain direction over time (source):

This one is a series with seasonality, which means that it goes up and down regularly in a repeating pattern, like every three months or every seven days, for instance (source):

And this one is a series with trend and seasonality:

A good predictive model extracts the past trend (if any) and extrapolates it forwards, and then incorporates the seasonality (if any), so that the end result is a fully-fledged time series of the future.
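The two steps can be sketched in a deliberately naive way: fit a straight line for the trend, average what is left over at each position in the cycle to get the seasonal pattern, and add the two back together for future dates. The function names, the additive form and the `period` parameter are assumptions for illustration; real methods estimate trend and seasonality more carefully.

```python
def fit_line(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values)) / sum(
        (x - mean_x) ** 2 for x in range(n)
    )
    return slope, mean_y - slope * mean_x


def trend_plus_seasonal_forecast(values, period, steps):
    """Extrapolate a linear trend, then add back an average seasonal offset.

    Assumes len(values) >= period, so every position in the cycle
    has at least one past observation.
    """
    n = len(values)
    slope, intercept = fit_line(values)
    # What remains once the trend is removed.
    residuals = [y - (intercept + slope * x) for x, y in enumerate(values)]
    # Average leftover at each position in the cycle (e.g. each weekday).
    offsets = [
        sum(residuals[pos::period]) / len(residuals[pos::period])
        for pos in range(period)
    ]
    return [intercept + slope * t + offsets[t % period] for t in range(n, n + steps)]


history = [11, 11, 13, 17, 19, 19, 21, 25]  # made-up series, cycle of length 4
print(trend_plus_seasonal_forecast(history, period=4, steps=4))
```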

When your time series shows neither a discernible trend nor seasonality, you go for simple exponential smoothing. If it does show a trend but it's not seasonal, you use Holt's exponential smoothing. And, finally, if your data has both a trend and seasonality, the method of choice is Holt-Winters exponential smoothing. In practice, data scientists might try all three of them and assess which one gives the most accurate prediction.
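To give a feel for the middle option, here is a minimal sketch of Holt's method with an additive trend. The parameter names alpha and beta and the way the level and trend are initialised are assumptions on my part; libraries usually estimate these from the data. The seasonal variant adds a third smoothed component for the seasonal pattern, which is left out here to keep things short.

```python
def holt_forecast(values, alpha, beta, steps):
    """Holt's linear exponential smoothing: smooth a level and a trend.

    Assumes at least two past values.
    """
    level = values[0]               # assumption: start the level at the first value
    trend = values[1] - values[0]   # assumption: start the trend at the first change
    for y in values[1:]:
        prev_level = level
        # Blend the new value with where the old level and trend expected it to be.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Blend the latest change in level with the old trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Project the final level forwards along the final trend.
    return [level + h * trend for h in range(1, steps + 1)]


print(holt_forecast([10, 12, 14, 16], alpha=0.5, beta=0.5, steps=2))
```

On a perfectly straight series like this one, the forecast simply continues the line; on noisy data, alpha and beta control how quickly the level and trend adapt.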

The accuracy of our prediction is measured by computing the errors of the model, that is, the differences between the predicted values and the actual values. We typically report a summary of these errors rather than the errors themselves (common summaries are the Mean Absolute Error, the Root Mean Square Error and the Mean Absolute Percentage Error). So, when comparing models, you normally want to pick the one with the lowest error.
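These three summaries are simple enough to write by hand. The sketch below uses made-up numbers and function names of my own; note that the percentage error divides by the actual values, so it breaks down if any of them is zero.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared error; punishes big misses more."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mean_absolute_percentage_error(actual, predicted):
    """Average error as a percentage of the actual value (actuals must be non-zero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100, 200]      # hypothetical observed values
predicted = [110, 180]   # hypothetical forecasts
print(mean_absolute_error(actual, predicted))
print(root_mean_square_error(actual, predicted))
print(mean_absolute_percentage_error(actual, predicted))
```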

As with any other statistical task, when doing time series forecasting you should be aware that there is uncertainty involved, so the predictions should never be taken as exactly what's going to happen. They are, simply, educated guesses that we make with the help of mathematics.

A great resource if you want to dig in a little bit more is this video here.

As always, thank you for reading. And don't forget to subscribe! See you next time!