Time Series Forecasting with XGBoost and LightGBM: Predicting Energy Consumption with Lag Features

<p>In a previous article, we went through the process of building a model to predict energy demand for the city of London. It was a time series forecasting problem in which we used the London Energy Dataset and the London Weather Dataset to build ensemble models, such as XGBoost and LightGBM, to estimate future electricity needs. At first, we approached the task by modeling its <em>time-dependent</em> properties. Then we added auxiliary information in the form of weather data, which improved the results by a significant margin. In this article, we will use so-called <em>lag</em> features to boost model performance even further.</p> <blockquote> <p>This article builds on the previous one and improves the models developed there. I strongly advise reading it carefully before delving into this one, since we will be using its entire codebase.</p> </blockquote> <h1>Lag Features</h1> <p>As already mentioned, we tackled the problem by exploring its <em>time-dependent</em> features. Even though these are generally considered the most influential in a time series setting, there are additional properties that can help model the task, such as its <em>serially dependent</em> properties. To make use of these, we feed past values of the target variable to the model as input features. These past values are what we call <em>lag</em> features. For instance, in our problem we can use the previous day&rsquo;s demand as an input feature when estimating the current day&rsquo;s demand. Of course, we can include as many past days as we want, with each one treated as a separate lag feature.</p> <p>Lag features are extremely useful for capturing <em>cycles</em>.
In a time series, cycles are rises and falls in the target value that depend mainly on previous target values rather than on time itself. These fluctuations are not seasonal, and their frequencies vary.</p> <p>To account for such cycles, we need lag features. To visualize serial dependence, we can use <em>lag plots</em>. One of the most popular is the autocorrelation plot, which shows the correlation between the target and its lagged values.</p> <p><a href="https://medium.com/mlearning-ai/time-series-forecasting-with-xgboost-and-lightgbm-predicting-energy-consumption-with-lag-features-dbf69970a90f">Visit Now</a></p>
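<p>As a rough illustration of the ideas above, the sketch below builds lag features with pandas and measures the autocorrelation of the target at a few lags. It does not use the article&rsquo;s codebase: the synthetic series, the column name <code>demand</code>, the helper <code>add_lag_features</code>, and the chosen lags (1, 2, and 7 days) are all assumptions made for the example.</p>

```python
import numpy as np
import pandas as pd


def add_lag_features(df, target, lags):
    """Return a copy of df with one shifted copy of `target` per lag.

    Hypothetical helper for illustration; not part of the original codebase.
    """
    out = df.copy()
    for lag in lags:
        # shift(lag) moves values down by `lag` rows, so each row
        # sees the target value from `lag` days earlier.
        out[f"{target}_lag_{lag}"] = out[target].shift(lag)
    return out


# Synthetic daily series with a weekly pattern plus noise
# (a stand-in for the real energy data, which is not reproduced here).
rng = np.random.default_rng(0)
n_days = 60
dates = pd.date_range("2020-01-01", periods=n_days, freq="D")
demand = 10 + np.sin(np.arange(n_days) * 2 * np.pi / 7) + rng.normal(0, 0.1, n_days)
df = pd.DataFrame({"demand": demand}, index=dates)

lags = [1, 2, 7]
lagged = add_lag_features(df, "demand", lags)

# The first max(lags) rows have no complete history, so drop them.
lagged = lagged.dropna()

# Correlation between the target and its own value `lag` days earlier.
autocorr = {lag: df["demand"].autocorr(lag=lag) for lag in lags}
print(lagged.head())
print(autocorr)
```

<p>The resulting lag columns can then be passed to the model alongside the other input features. For the plots themselves, pandas offers <code>pandas.plotting.lag_plot</code> and <code>pandas.plotting.autocorrelation_plot</code>, which could be applied to the same series.</p>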