Forecasting
Forecasting, in simpler terms, is a process of predicting future values of a variable based on past data and other variables that are related to the variable being forecasted. For example, values of future demand for tickets for a particular airline company depend on past sales and the price of its tickets.
Time-series data is used for forecasting purposes. According to Wikipedia ‘A time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus, it is a sequence of discrete-time data.’ An example of time series data for monthly airline passengers is given below:
MacKinnon approximate p-value for Z(t) = 0.0000
Box-Jenkins Approach
Exponential Smoothing method
Exponential smoothing is one of the most popular classical forecasting models. It gives more weight to recent values and works best for short-term forecasts when there is no trend or seasonality in the dataset. The model is given by:
ŷ(t+h|t) = αy(t) + α(1-α)y(t-1) + α(1-α)²y(t-2) + …
with 0 < α < 1.
As the model shows, recent observations carry more weight, and the weights decrease exponentially the further back in time we go.
The smoothing factor α was set to 0.9, the value that gave the lowest RMSE among the values tried.
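The recursion behind the weighted sum, and the search for the smoothing factor with the lowest RMSE, can be sketched as follows. The article's models were written in R; this is a Python illustration only, not the author's code. The function names, the candidate grid, the choice to start the level at the first observation, and the sample numbers are all assumptions made for the example.

```python
import math

def ses_fitted(series, alpha):
    """Return one-step-ahead forecasts; fitted[t] is the forecast made for series[t]."""
    level = series[0]  # assumption: start the level at the first observation
    fitted = [level]
    for y in series[:-1]:
        level = alpha * y + (1 - alpha) * level  # newest value gets weight alpha
        fitted.append(level)
    return fitted

def ses_forecast(series, alpha):
    """Return the flat forecast that simple exponential smoothing gives for every step h."""
    level = series[0]
    for y in series:
        level = alpha * y + (1 - alpha) * level
    return level

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def best_alpha(series, grid):
    """Pick the alpha in `grid` with the lowest in-sample one-step RMSE."""
    return min(grid, key=lambda a: rmse(series[1:], ses_fitted(series, a)[1:]))

# Made-up monthly rates for illustration only (not the CMIE data).
rates = [7.0, 7.4, 7.1, 7.8, 8.2, 7.9, 8.6, 9.1, 8.8, 8.4]
grid = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9
alpha = best_alpha(rates, grid)
print("chosen alpha:", alpha)
print("forecast for all horizons:", round(ses_forecast(rates, alpha), 2))
```

Because the method has no trend term, the same value is returned for every horizon, which is why forecasts from this model are flat.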
The forecast results are presented below:
Holt Winters’ method
Simple exponential smoothing cannot be used effectively for data with trends. Holt-Winters' exponential smoothing method is better suited to such data. The model contains a forecast equation and two smoothing equations, one for the level and one for the trend. The linear model is given by:
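ŷ(t+h|t) = l(t) + h·b(t)
l(t) = αy(t) + (1-α)(l(t-1) + b(t-1))
b(t) = β(l(t) - l(t-1)) + (1-β)b(t-1)
where l(t) is the level (smoothed value) at time t,
h is the number of steps ahead, and
b(t) is the estimate of the trend (slope) at time t, a weighted average of the latest change in level and the previous trend estimate.
As in simple exponential smoothing, the level l(t) is a weighted average: here, of the observation y(t) and the one-step-ahead forecast for time t, l(t-1) + b(t-1).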
The smoothing parameters were set to α = 0.99 and β = 0.0025, the combination that gave the lowest RMSE among the values tried.
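A minimal Python sketch of the level and trend recursions may make the method easier to follow. It uses the standard form of Holt's linear trend update, in which the level is smoothed towards the previous level plus the previous trend. The function name, the initialisation (first observation for the level, first difference for the trend) and the sample data are assumptions for illustration. Only the values α = 0.99 and β = 0.0025 come from the article, and the author's R code may differ in its details.

```python
def holt_forecast(series, alpha, beta, horizon):
    """Holt's linear trend method: return forecasts for steps 1..horizon."""
    # Assumption: initialise the level with the first observation and the
    # trend with the first difference; other initialisations are common.
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)    # level equation
        trend = beta * (level - prev_level) + (1 - beta) * trend  # trend equation
    return [level + h * trend for h in range(1, horizon + 1)]     # forecast equation

# Made-up monthly rates for illustration only (not the CMIE data).
rates = [7.0, 7.4, 7.1, 7.8, 8.2, 7.9, 8.6, 9.1, 8.8, 8.4]
for h, value in enumerate(holt_forecast(rates, alpha=0.99, beta=0.0025, horizon=4), start=1):
    print(f"h={h}: {value:.2f}")
```

With α close to 1 the level tracks the latest observation almost exactly, and with β close to 0 the trend estimate barely moves from its starting value, so forecasts under these settings stay close to the last observed value.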
The forecast results are presented below:
Evaluation
Two error metrics were chosen to compare the models:
The table shows that the ARIMA (Box-Jenkins) method has the lowest RMSE and the lowest MAE of the models considered, while the exponential smoothing method has the highest of both.
Therefore, the unemployment rate forecasts from the Box-Jenkins method for the next four months are:
The way ahead?
Figure 1
More technically, a time series is modelled as a stochastic process, Y(t). With time-series data, we are interested in estimating values of Y(t+h) using the information available at time t.
Unemployment rate
The unemployment rate is the proportion of people in the labour force who are willing and able to work but are unable to find work. It is an indicator of the health of the economy because it provides a timely measure of the state of the labour market and hence of overall economic activity. In the wake of the impact of Covid-19 on economic activity throughout the world, unemployment rate analysis and forecasts have become paramount in assessing economic conditions. In India, unemployment rates have been on the higher end in recent times. According to data released by the Statistics Ministry, the unemployment rate for FY18 was 6.1%, the highest in 45 years. It is no coincidence that GDP growth rates have also been declining successively for the past few years. The shock that Covid-19 has given to the economy has only worsened the situation: the unemployment rate rose to 27.1% as a whopping 121.5 million people were forced out of work.
Figure 2
Source: CMIE
Methodology
The data used to forecast unemployment rates was sourced from the CMIE website. CMIE surveys over 43,000 households and has generated monthly estimates since January 2016. The data has 56 monthly observations ranging from January 2016 to August 2020; data before 2016 was not available. Four popular econometric forecasting models (ARIMA, Naïve, Exponential Smoothing and Holt-Winters' method) were used, and the best-performing model was chosen to forecast unemployment until December 2020. The forecasting models were programmed in R; the code is available from the author on request. The Dickey-Fuller test and the Chow test for structural breaks were conducted in Stata, and their results are presented later in the article.
Before beginning the analysis, I believe that the limitations of the analysis should be mentioned:
- The sample size of 56 observations is not sufficient for a thorough analysis; ideally, the sample would have been 2-3 times larger. Smaller sample sizes lead to skewed forecasting results which are prone to errors.
- The unemployment data from CMIE is an estimate and is a secondary source. In India, primary data is only collected once every 3-4 years, so the forecasting results are only as good as the source of the data.
- This is a univariate analysis. An Okun's law based analysis of the unemployment rate as a function of GDP (output) and past trends would have been more suitable. However, since GDP data is only available quarterly and there are only 56 monthly observations, this would have left only 19 quarterly observations, too few for a meaningful analysis.
- Forecasting, being based on past trends, is prone to errors. The negative shock that Covid-19 has dealt to economies worldwide has made forecasting all the more difficult. A Bloomberg study analysed over 3,200 forecasts by the IMF since 1999 and found that over 93% of the forecasts underestimated or overestimated the results, with a mean error of 2 percentage points.
Dickey-Fuller test on raw data
Table 1
Statistic | Value | 1% critical value | 5% critical value | 10% critical value |
Z(t) | -2.303 | -3.576 | -2.928 | -2.599 |
Critical values are interpolated Dickey-Fuller values.
MacKinnon approximate p-value for Z(t) = 0.1709
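The decision rule behind the table can be sketched in Python. The test is one-sided: non-stationarity is rejected only when the test statistic lies below (is more negative than) the critical value. The sketch assumes the simplest Dickey-Fuller regression, with a constant and no lagged differences, which may not match the exact Stata specification used for the article. The function names and the five-point series are made up for illustration. The test statistics and critical values are copied from the article's tables.

```python
import math

def df_t_statistic(series):
    """t-statistic on gamma in the regression diff(y)[t] = c + gamma * y[t-1] + e[t]."""
    x = series[:-1]                                   # lagged levels y[t-1]
    d = [b - a for a, b in zip(series, series[1:])]   # first differences
    n = len(d)
    x_mean, d_mean = sum(x) / n, sum(d) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (di - d_mean) for xi, di in zip(x, d))
    gamma = sxy / sxx                                 # OLS slope
    const = d_mean - gamma * x_mean                   # OLS intercept
    ssr = sum((di - const - gamma * xi) ** 2 for xi, di in zip(x, d))
    std_err = math.sqrt(ssr / (n - 2) / sxx)          # standard error of gamma
    return gamma / std_err

def reject_unit_root(test_statistic, critical_value):
    """Reject non-stationarity only if the statistic is below the critical value."""
    return test_statistic < critical_value

# Critical values copied from the article's tables (they depend on sample size).
critical = {"1%": -3.576, "5%": -2.928, "10%": -2.599}
for label, stat in [("raw series", -2.303), ("first difference", -5.035)]:
    decisions = {lvl: reject_unit_root(stat, cv) for lvl, cv in critical.items()}
    print(label, stat, decisions)

# Made-up series, only to show the statistic being computed.
print(round(df_t_statistic([0, 1, 0, 2, 1]), 3))
```

For the raw series the statistic of -2.303 is above all three critical values, so the null hypothesis of a unit root cannot be rejected, in line with the reported p-value of 0.1709.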
Converting the non-stationary series into a stationary series
To transform the non-stationary series into a stationary one, we use differencing (computing the difference between consecutive observations). We again plot the data over time, the ACF and the PACF, as shown in Figure 5 below and Figures 10 and 11 in the appendix, respectively. From the figures, we can intuitively say that the transformed series is stationary. Further, we used the Augmented Dickey-Fuller (ADF) test to ascertain the stationarity of the series. Table 2 shows the result of the ADF test. The test statistic is significant at the 1, 5 and 10 per cent levels and the p-value is less than 0.05, so we reject the null hypothesis that the series is non-stationary. The test confirms that the differenced series is stationary.
Dickey-Fuller test on first-difference data
Table 2
Statistic | Value | 1% critical value | 5% critical value | 10% critical value |
Z(t) | -5.035 | -3.576 | -2.928 | -2.599 |
Critical values are interpolated Dickey-Fuller values.
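First differencing itself is a one-line operation. The short Python sketch below shows it, together with the inverse step needed to turn forecasts of differences back into levels. The function names and the sample series are made up for illustration.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def undifference(first_value, diffs):
    """Rebuild the original levels from the first observation and the first differences."""
    levels = [first_value]
    for step in diffs:
        levels.append(levels[-1] + step)
    return levels

# Made-up upward-trending series for illustration only.
levels = [10, 12, 15, 19, 24]
diffs = difference(levels)
print(diffs)
print(undifference(levels[0], diffs) == levels)
```

Differencing shortens the series by one observation, so 56 monthly values leave 55 differences to model.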
Figure 3
Naïve model
Naïve models are the simplest forecasting models and provide a benchmark against which more sophisticated models can be compared. A Naïve model is therefore an ideal starting point for any comparative analysis. In a Naïve model, every forecast is simply the value of the last observation: ŷ(t+h|t) = y(t). Forecast results from the Naïve method are presented below in Figure 4 and Table 1.
Figure 4
Table 1
Month | Point forecast | Low 80 | High 80 | Low 95 | High 95 |
Sept | 8.35 | 4.861900 | 11.83810 | 3.0154109 | 13.68459 |
Oct | 8.35 | 3.417081 | 13.28292 | 0.8057517 | 15.89425 |
Nov | 8.35 | 2.308433 | 14.39157 | -0.8897794 | 17.58978 |
Dec | 8.35 | 1.373799 | 15.32620 | -2.3191783 | 19.01918 |
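The interval widths in the table grow in proportion to the square root of the horizon, which is what the usual prediction interval for a naïve forecast, ŷ ± z·σ·√h, would give. The Python sketch below illustrates that formula. It assumes normally distributed errors and takes σ ≈ 2.72, the RMSE reported for the Naïve model in the comparison table, as a stand-in for the residual standard deviation. Under those assumptions it approximately reproduces the bounds above, but it is not the author's R code, and the function name is made up.

```python
import math
from statistics import NormalDist

def naive_forecast(last_value, sigma, horizon, level=0.80):
    """Return (h, point, low, high) rows for a naive forecast with normal intervals."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided normal quantile
    rows = []
    for h in range(1, horizon + 1):
        half_width = z * sigma * math.sqrt(h)   # uncertainty grows with sqrt(h)
        rows.append((h, last_value, last_value - half_width, last_value + half_width))
    return rows

# 8.35 is the last observed rate (the point forecast in the table);
# sigma = 2.72 is an assumption taken from the reported Naive RMSE.
for level in (0.80, 0.95):
    for h, point, low, high in naive_forecast(8.35, 2.72, horizon=4, level=level):
        print(f"{int(level * 100)}% h={h}: {point:.2f} [{low:.2f}, {high:.2f}]")
```

The negative lower bounds at longer horizons are an artefact of the symmetric normal interval; an unemployment rate cannot fall below zero.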
- Identification of ARIMA (p, d, q) model
- Forecasting
Figure 5
Table 2
Month | Point forecast | Low 80 | High 80 | Low 95 | High 95 |
Sept | 9.04 | 5.978858 | 11.93987 | 4.401073 | 13.51765 |
Oct | 9.77 | 5.183039 | 14.1951 | 2.797671 | 16.58054 |
Nov | 10.3 | 5.364191 | 15.06267 | 2.797157 | 17.62971 |
Dec | 10.3 | 5.280182 | 15.14668 | 2.668678 | 17.75819 |
Figure 6
Table 3
Month | Point forecast | Low 80 | High 80 | Low 95 | High 95 |
Sept | 8.30 | 4.739288 | 11.87260 | 2.8512134 | 13.76068 |
Oct | 8.30 | 3.507498 | 13.10439 | 0.9673541 | 15.64454 |
Nov | 8.30 | 2.532806 | 14.07908 | -0.5233096 | 17.13520 |
Dec | 8.30 | 1.700403 | 14.91149 | -1.7963595 | 18.40825 |
Figure 7
Table 4
Month | Point forecast | Low 80 | High 80 | Low 95 | High 95 |
Sept | 8.34 | 4.749288 | 11.9326 | 2.84121 | 13.84 |
Oct | 8.33 | 3.24 | 13.4243 | 0.54541 | 16.11977 |
Nov | 8.32 | 2.0800 | 14.5678 | -1.2253 | 17.87316 |
Dec | 8.31 | 1.0963 | 15.53419 | -2.725103 | 19.35565 |
- Root mean square error (RMSE)
- Mean absolute error (MAE)
Metric | Naive | ARIMA | Exp Smoothing | Holt-Winters' |
RMSE | 2.72 | 2.24 | 2.73 | 2.7 |
MAE | 1.05 | 1.034 | 1.06 | 1.05 |
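For reference, the two metrics can be computed as in the Python sketch below. The function names are illustrative. The `scores` dictionary simply copies the values from the table above, to show how the best model is picked by each metric.

```python
import math

def rmse(actual, predicted):
    """Root mean square error: large errors are penalised more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: every error counts in proportion to its size."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# (RMSE, MAE) pairs copied from the comparison table in the article.
scores = {
    "Naive": (2.72, 1.05),
    "ARIMA": (2.24, 1.034),
    "Exp Smoothing": (2.73, 1.06),
    "Holt-Winters": (2.7, 1.05),
}
best_by_rmse = min(scores, key=lambda m: scores[m][0])
best_by_mae = min(scores, key=lambda m: scores[m][1])
print(best_by_rmse, best_by_mae)
```

Because RMSE squares each error before averaging, a few very large misses can pull it well above MAE, which is one plausible reading of the gap between the two rows in the table.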
Sept | 9.04 |
Oct | 9.77 |
Nov | 10.3 |
Dec | 10.3 |
- The unemployment rate is expected to rise in the coming months. This is a bad sign for an economy that is already suffering.
- With GDP forecasts for the current financial year being revised lower and lower, the government needs to act quickly to mitigate the potential damage.
- It is impossible to correctly ascertain the total impact of Covid-19 on the economy and the range of that impact, but it is safe to say that we will be seeing the effects for a long time to come in some form or other.
- We might see more and more people slip into poverty and depression, increased domestic violence, and potentially long-term effects on human development indicators such as child malnutrition and enrolment rates, among other things.
- Expansionary monetary policy: This is a common tool for dealing with a high unemployment rate in the short term. Under expansionary monetary policy, the central bank reduces the interest rate at which it lends money to banks; the banks subsequently lower their own rates, which leads to more loans being taken by business owners. This extra capital helps businesses hire more workers and expand production, which in turn reduces the unemployment rate.
- Expansionary fiscal policy: Under expansionary fiscal policy, the government increases its spending, particularly in the infrastructure sector. It spends more money to build dams, roads, bridges, highways etc. This increased spending leads to an increase in employment, as these projects require labour.
- Expand the scope of NREGS to urban areas permanently and set a higher minimum wage for all: NREGS has proved to be really effective in alleviating poverty, improving quality of life and decreasing the unemployment rate in rural areas. Given the unprecedented circumstances, the government can consider expanding its scope to urban areas, so that it could provide employment to the millions of unemployed workers there. This increase in expenditure could also help the government revive consumer demand, which is essential if we want to help GDP get back on track.
- A stimulus package aimed at putting money into the hands of the poor :
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13