Forecasting Customs Revenue Collection in Light of the Spread of the COVID-19 Pandemic using ARIMA Models and the Exponential Smoothing Methods in Libya

Forecasting future values of economic variables is one of the most critical tasks for governments. Customs revenue collection in particular must be forecast efficiently, because the need for planning in this sector is great and customs revenue is one of the sources of funding for the state's public treasury. The main objective of this research is to identify an appropriate statistical time series model for forecasting customs revenue collection during the current COVID-19 pandemic in Libya. The research considers the ARIMA model and the simple and Brown's linear trend exponential smoothing methods. The data cover 108 weekly observations, from the first week of June 2019 to the last week of August 2021. The forecasting results show that the ARIMA (0,1,1) model offered more probabilistic information that improves forecasting the volume of customs revenue collection in light of the COVID-19 pandemic. According to this model, customs revenue collection is forecast to increase over the next eight weeks (two months). The ARIMA model and exponential smoothing methods used in this research are linear models based on the reactions of customs revenue collection to the spread of the COVID-19 pandemic in the world. The forecasting performance of linear and nonlinear models can be compared in future studies.


Introduction
Over the past years, our country, Libya, has gone through a clear decline in all economic sectors due to the civil war, which led to an increase in exchange rates and damage to the infrastructure. In addition, the COVID-19 pandemic (an infectious ...
ARIMA models and exponential smoothing (EST) methods are quantitative forecasting methods that have been widely used in many fields. ARIMA models were developed gradually through several studies, the most prominent of which are Yule (1927), Slutzky (1927) and Box and Jenkins (1970). Yule developed an autoregressive (AR) model by analyzing Wolfer's sunspot data. Slutzky developed a moving average (MA) model by studying a series of discrete white noise signals, that is, sequences of serially uncorrelated random variables with zero mean and finite variance. Box and Jenkins combined the AR and MA models to develop the ARMA model and then proposed the integrated AR and MA (ARIMA) model under the assumption that the time series can be made stationary by differencing [3]. The EST methods were originally developed by Brown (1963) [4] and Winters (1960) [5], among other researchers, in the late 1950s.
This section presents a brief description of the methods. An observed time series is denoted by $y_1, y_2, \ldots, y_n$. The forecast of $y_{n+h}$ made at time $n$ is denoted by $\hat{y}_n(h)$, where the integer $h$ is called the lead time or the forecasting horizon. The observed one-step-ahead prediction errors, namely $e_t = y_t - \hat{y}_{t-1}(1)$, are used to evaluate past forecasts and modify future ones.
With regard to the spread of diseases, ARIMA and EST models have been used in the past to forecast severe disease outbreaks, for example hand, foot and mouth disease morbidity in China. Since the emergence of the coronavirus in late 2019, scholars have studied the pattern and level of its infection using various mathematical modeling approaches from epidemic research. In [14], a forecast model based on an artificial neural network (ANN) was developed to approximate the evolution of coronavirus cases worldwide from the previous two weeks of data and geolocation. The authors compared the numbers forecast by their model with the real values and found that they matched closely. In [15], a neural network (NN) model for forecasting the spread of coronavirus was proposed. The model was trained using the Nadam training method and produced forecasts for various countries and regions worldwide. The proposed model attained highly precise results, around 87.7% for most regions.
Several scholars have used EST and ARIMA models to predict the spread of COVID-19 in numerous countries. In [16], the researchers investigated and evaluated the accuracy of an ARIMA model over a relatively long period of time, using Kuwait as a case study. In [17], the scholars used Johns Hopkins epidemiological data to predict coronavirus incidence and prevalence. In [18], a model was developed to estimate the dispersal point of the coronavirus pandemic in Sudan. Similarly, in [19], Box-Jenkins and exponential smoothing methods were used to forecast the number of COVID-19 cases in selected G8 countries (Germany, the United Kingdom, France, Italy, Russia, Canada and Japan) and Turkey. The researchers in [20] used ARIMA to predict the COVID-19 pandemic in Indonesia. In [1], ARIMA was used to forecast the spread of the COVID-19 pandemic in Saudi Arabia under current public health interventions. In [21], ARIMA was applied to predict COVID-19 infections and deaths, as well as the impact on the Chinese economy in comparison with the SARS virus.
Although EST and ARIMA models have been applied to forecasting the future course of the coronavirus, some scholars believe that ARIMA time series models are not suited to non-linear relations, especially in a dynamic and complex setting [22]. The accuracy of EST and ARIMA model-based forecasts should therefore be inspected further. All EST and ARIMA model-based forecasts use accuracy measures to choose a best-fit forecast model; nevertheless, the predicted values will not exactly match the actual values observed for the same time period. This can be due to numerous factors, for example the different restrictions mandated by countries to limit the spread and the extent to which the public adheres to these restrictions. Therefore, the main purpose of this research is to forecast customs revenue collection (CRC) in light of the spread of the COVID-19 pandemic in Libya, by comparing the proposed methods with each other experimentally and then choosing the best prediction model using appropriate statistical criteria. To our knowledge, no previous study has dealt with this topic using the same statistical methods.

I. Data collection.
This research is conducted using data on CRC. The data cover 108 weekly observations, from the first week of June 2019 to the last week of August 2021. The source of the data is the Libyan Customs' Department of Statistics, and the values are expressed in Libyan dinars. The series is plotted in Figure 1.

II. Evaluation of the forecasting performance indices.
The following commonly used accuracy measures are used to assess the performance of each model described below. The Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) [23] are computed from $\sigma^2$, the variance of the error, $k$, the number of parameters, and $n$, the number of observations. The coefficient of determination, $R^2$, is the proportion of the variance explained by the model [24].
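As a hedged illustration, one common least-squares form of these measures, built only from the quantities defined above, is sketched below. Whether the authors used exactly this variant (rather than, for example, a version scaled by $n$) is an assumption; $\hat{y}_t$ (fitted value) and $\bar{y}$ (sample mean) are symbols introduced here for the sketch.

```latex
\begin{align}
\mathrm{AIC} &= n \ln\left(\sigma^2\right) + 2k \\
\mathrm{BIC} &= n \ln\left(\sigma^2\right) + k \ln(n) \\
R^2 &= 1 - \frac{\sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{n} \left(y_t - \bar{y}\right)^2}
\end{align}
```

In this form, smaller AIC and BIC values and a larger $R^2$ indicate a better model, and BIC penalizes additional parameters more heavily than AIC whenever $\ln(n) > 2$.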

I. Stationarity test:
A time series is a sequence of observations on a variable, typically made at equally spaced intervals over time. A time series is covariance stationary (weakly or simply stationary) if its mean and variance are constant and its covariance depends only on the distance or lag between two periods and not on the actual time at which the covariance is calculated [28][29][30]. A time series must be stationary to be modeled using ARIMA models and exponential smoothing models. OLS regression is frequently used to estimate the coefficients of the models, and the use of OLS relies on the stochastic process being stationary. When the stochastic process is nonstationary, OLS can produce invalid estimates. Granger [31] called such estimates 'spurious regression' results, i.e., they have high $R^2$ values and t-ratios but no economic meaning.
This research performs the ADF and PP unit root tests of stationarity, which account for structural effects (autocorrelation) in the time series. It also uses the autocorrelation function (ACF) and partial ACF (PACF) to assess the stationarity of the data. The ACF of a nonstationary series exhibits a characteristic pattern, with a slow decrease in the size of the autocorrelations. Six examples of such series are provided in Figure 2. Two of the time series in this figure, (a) and (b), are white noise, whereas (c), (d), (e) and (f) are not.
Autoregressive AR (p): The autoregressive model has an order, $p$, which determines how many previous values must be included in the difference equation to estimate the current value. A difference equation relates a variable $y_t$ at time $t$ to its previous values. The AR model is given by Equation (6):
$$y_t = \phi_0 + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t \qquad (6)$$

Journal of Alasmarya University: Basic and Applied Sciences
where $y_t$ is the dependent variable, $\phi_i$ is the coefficient of the AR parameter of order $p$ ($i = 1, \ldots, p$), $\phi_0$ is the intercept, and $\varepsilon_t$ is a random disturbance assumed to be distributed as $N(0, \sigma^2)$.
By using the difference equation, the value of $y_t$ can be obtained from $y_{t-1}$, the value of $y_{t-1}$ can be obtained from $y_{t-2}$, and so on.
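The recursion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `ar_forecast` is a hypothetical helper name, the coefficients and starting value are made-up numbers rather than estimates from the CRC data, and future disturbances are replaced by their expected value of zero.

```python
def ar_forecast(history, phi0, phis, steps):
    """Iterate y_t = phi0 + phi_1*y_{t-1} + ... + phi_p*y_{t-p} forward.

    Future disturbances are set to their expected value (zero).
    """
    p = len(phis)
    if len(history) < p:
        raise ValueError("need at least p past values")
    values = list(history)
    for _ in range(steps):
        # The most recent value pairs with phi_1, the one before with phi_2, ...
        lagged = values[-1:-p - 1:-1]
        values.append(phi0 + sum(c * y for c, y in zip(phis, lagged)))
    return values[len(history):]


# Illustrative AR(1) with assumed phi0 = 2.0 and phi1 = 0.5, starting at y = 10
forecasts = ar_forecast([10.0], phi0=2.0, phis=[0.5], steps=3)
print(forecasts)
```

Each forecast is fed back in as the "previous value" for the next step, which is exactly how the difference equation chains $y_t$ to $y_{t-1}$, $y_{t-2}$ and so on.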
For the AR ($p$) model, the coefficients $\phi_1, \phi_2, \ldots, \phi_p$ can be determined either via OLS regression or via maximum likelihood estimation (MLE). The MLE results are consistent, asymptotically normal and asymptotically equivalent to those of the OLS estimators.
Moving average MA (q): A time series is influenced by random shocks (noise). As a result, the current value of the series is affected by random shocks that appeared in previous periods. The moving average terms capture the influence of previous random shocks on the future value. Equation (7) presents an MA model that expresses $y_t$ (the dependent variable) in terms of the forecast errors $\varepsilon_t$ [29,30,33]:
$$y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \qquad (7)$$
where $\theta_j$ is the coefficient of the MA parameter of order $q$ ($j = 1, \ldots, q$).
For an MA (q) model, MLE is used to determine the model coefficients $\theta_1, \ldots, \theta_q$ [34].
Auto-Regressive Moving Average ARMA (p,q): ARMA models are used when the series is partly autoregressive and partly moving average. ARMA (p, q) denotes an ARMA process with autoregressive order $p$ and moving-average order $q$. The autoregressive parameters ($\phi$) and the moving-average parameters ($\theta$) have important effects on the detection ability of ARMA. ARMA (p, q) combines Equations (6) and (7) to form the following model:
$$y_t = \phi_0 + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$
MLE determines the model coefficients $\phi_1, \ldots, \phi_p$ and $\theta_1, \ldots, \theta_q$ for an ARMA (p, q) model [3,35].
ARIMA (p,d,q) Model: The ARIMA model comprises three processes, namely, (1) the AR (p) process, which accounts for the memory of past events, (2) an integrated process I(d), which differences the data $d$ times to make them stationary, and (3) the MA (q) process, which accounts for a finite sum of forecasting error terms.
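The three components can also be written compactly with the backshift operator $B$, defined by $B y_t = y_{t-1}$. The block below is the standard textbook representation rather than a formula taken from the source, and it assumes positive signs on the MA coefficients, which is a convention choice.

```latex
\phi(B)\,(1 - B)^d\, y_t = \phi_0 + \theta(B)\,\varepsilon_t,
\qquad
\phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i,
\qquad
\theta(B) = 1 + \sum_{j=1}^{q} \theta_j B^j
```

For example, with $p = 0$, $d = 1$, $q = 1$ and $\phi_0 = 0$, this reduces to $y_t - y_{t-1} = \varepsilon_t + \theta_1 \varepsilon_{t-1}$, which is the ARIMA (0,1,1) model without a constant.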

Simple Exponential Smoothing (SES):
The SES model is based on the premise that the time series fluctuates about a constant level or changes slowly over time [36]. As stated by [37], simple exponential smoothing is easily applied, and it produces a smoothed statistic as soon as two observations are available. This model is appropriate for a series with no trend or seasonality. Let the observed time series of a variable $y$ up to time period $t$ be denoted by $y_1, y_2, \ldots, y_t$. Suppose that we want to determine the forecast $L_t$ of the next value $y_{t+1}$ of the series, which is yet to be observed. Given data up to time period $t-1$, the forecast for the next time period $t$ is denoted by $L_{t-1}$. When the observation $y_t$ becomes available, the forecast error is $y_t - L_{t-1}$.
In simple (single) exponential smoothing, the forecast for the next period is calculated by taking the forecast for the previous period and adjusting it using the forecast error. Simple exponential smoothing has a single level parameter and can be described by the following equations:
$$L_t = \alpha y_t + (1 - \alpha) L_{t-1}$$
$$\hat{y}_t(k) = L_t$$
where $\alpha$ is the level smoothing weight, which lies between 0 and 1, $L_{t-1}$ is the old smoothed value, that is, the forecast for period $t$, $y_t$ is the new observation or actual value of the series in period $t$, and $\hat{y}_t(k)$ is the forecast for $k$ periods ahead, i.e., the forecast of $y_{t+k}$ based on all data points up to time period $t$. The ARIMA model equivalent to the simple exponential smoothing model is the ARIMA (0,1,1) model, with zero orders of autoregression, one order of differencing, one order of moving average, and no constant [38,39].
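The adjustment described above, in which the previous forecast is corrected by a fraction of the forecast error, can be sketched as follows. This is a minimal illustration under stated assumptions: `simple_exponential_smoothing` is a hypothetical helper name, the observations are made-up numbers, and starting the level at the first observation is an assumption (statistical packages may initialise differently).

```python
def simple_exponential_smoothing(series, alpha, initial_level=None):
    """Return the smoothed levels after each observation, for 0 < alpha < 1.

    The starting level defaults to the first observation (an assumption).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    level = series[0] if initial_level is None else initial_level
    levels = []
    for y in series:
        error = y - level              # one-step-ahead forecast error
        level = level + alpha * error  # same as alpha*y + (1 - alpha)*level
        levels.append(level)
    return levels


observations = [10.0, 12.0, 11.0, 13.0]  # made-up values
levels = simple_exponential_smoothing(observations, alpha=0.5)
forecast = levels[-1]  # the SES forecast is flat: identical for every horizon k
print(levels, forecast)
```

Because the forecast equals the last smoothed level for every horizon, SES projects a flat line, which is why it suits series without trend or seasonality.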

Brown's Linear Trend Exponential Smoothing (BLTES):
This model is appropriate for a series in which a linear trend and no seasonality are observed. Its smoothing parameters are level and trend, which are assumed to be equal. Thus, Brown's model is a special case of Holt's model. Brown's exponential smoothing has level and trend parameters and can be described by the following three equations:
$$L_t = \alpha y_t + (1 - \alpha) L_{t-1}$$
$$T_t = \alpha (L_t - L_{t-1}) + (1 - \alpha) T_{t-1}$$
$$\hat{y}_t(k) = L_t + \left((k - 1) + \alpha^{-1}\right) T_t$$
where $L_t$ is the exponentially smoothed value (level) of $y_t$ at time $t$, $T_t$ is the smoothed trend at time $t$, obtained by smoothing the changes in $L_t$ a second time, $\alpha$ is the smoothing constant ($0 < \alpha < 1$), and $\hat{y}_t(k)$ is the forecast for $k$ periods ahead made at time $t$. The ARIMA model equivalent to the linear exponential smoothing model is the ARIMA (0,2,2) model [39].
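Brown's method as described above can be sketched in Python. This is a hedged illustration, not the authors' code: the update equations in the docstring follow the level-and-trend form with a single shared weight that is used in common statistical software, the function name `brown_linear_trend` is hypothetical, the starting values (level equal to the first observation, trend equal to zero) are assumptions, and the data are made up.

```python
def brown_linear_trend(series, alpha, horizon=1):
    """Brown's linear trend smoothing with one weight for level and trend.

    L_t = alpha*y_t + (1 - alpha)*L_{t-1}
    T_t = alpha*(L_t - L_{t-1}) + (1 - alpha)*T_{t-1}
    forecast(k) = L_t + ((k - 1) + 1/alpha) * T_t

    Starting values L_0 = y_1 and T_0 = 0 are an assumption.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    level, trend = series[0], 0.0
    for y in series:
        previous_level = level
        level = alpha * y + (1.0 - alpha) * level
        trend = alpha * (level - previous_level) + (1.0 - alpha) * trend
    # Forecasts for k = 1, ..., horizon periods ahead
    return [level + ((k - 1) + 1.0 / alpha) * trend
            for k in range(1, horizon + 1)]


trend_forecasts = brown_linear_trend([1.0, 2.0, 3.0], alpha=0.5, horizon=2)
print(trend_forecasts)
```

Unlike simple exponential smoothing, successive forecasts differ by the smoothed trend, so the projection is a sloped line rather than a flat one.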

Results
The first step in developing the exponential smoothing and ARIMA models is to check whether the time series is stationary. The data cover 96 observations, as shown in Figure 1. As with the sequence of values for a single variable in ordinary data analysis, each case (row) in the data represents an observation at a different time. The observations must be taken at equally spaced time intervals. Based on Figure 1, the dataset is not stationary in variance; that is, a natural logarithm transformation is required to give the dataset a constant variance. The dataset is also not stationary in mean; that is, differencing is required to provide a constant mean. Based on Figure 3, we conclude that the transformed data are stationary.
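The two transformations described above, a natural logarithm followed by first-order differencing, can be sketched as follows. The function name `log_difference` and the revenue figures are made up for illustration and are not taken from the CRC data.

```python
import math


def log_difference(series):
    """Natural-log transform followed by first-order differencing."""
    if any(value <= 0 for value in series):
        raise ValueError("the log transform needs strictly positive values")
    logs = [math.log(value) for value in series]
    # Each output is log(y_t) - log(y_{t-1}); the result is one element shorter
    return [current - previous for previous, current in zip(logs, logs[1:])]


weekly_revenue = [100.0, 110.0, 121.0, 133.1]  # made-up values growing 10% per week
transformed = log_difference(weekly_revenue)
print(transformed)
```

In this made-up example the raw series grows by ever larger absolute amounts, while the transformed series is constant at about $\ln(1.1)$, which illustrates how the log stabilizes the variance and the difference removes the changing mean.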

Figure 4. ACF and PACF of residuals of Customs Revenue Collection.
Based on Figure 4, the ACF and PACF of the residual errors are non-significant, implying that the proposed model is suitable for the data. The research compares all the above-mentioned models in several steps, testing the significance of the estimated parameters and measuring the forecasting error. Among the exponential smoothing models assessed in the present research, the identified optimal model is Brown's linear trend, for which the coefficient of determination is larger than that of the simple exponential smoothing model and the AIC and BIC values are slightly smaller. In addition, the Ljung-Box test for this model gives a non-significant p-value, indicating that the residuals appear to be uncorrelated.
As shown in Table 3, all parameters in the first two ARIMA models are significant, whereas in each of the remaining models at least one parameter is not significant. The selected model also approximately fulfills the basic criteria of model selection, with minimum values of AIC and BIC, a high coefficient of determination and a non-significant Ljung-Box test. Among the ARIMA models assessed in Table 4, the identified optimal model is the (0,1,1) model, for which the coefficient of determination is slightly larger than that of the first model and the AIC and BIC values are slightly smaller than those of the (1,1,0) model. In addition, the Ljung-Box test for this model gives a non-significant p-value, indicating that the residuals appear to be uncorrelated.
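The residual checks mentioned above rest on the sample autocorrelations and the Ljung-Box statistic. The sketch below computes both in plain Python; the function names and the residual values are hypothetical, and the p-value step (a chi-square lookup) is deliberately left out.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    denominator = sum(d * d for d in deviations)
    return [
        sum(deviations[t] * deviations[t - lag] for t in range(lag, n)) / denominator
        for lag in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(residuals)
    r = autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(r_k ** 2 / (n - k) for k, r_k in enumerate(r, start=1))


residuals = [0.3, -0.1, 0.2, -0.4, 0.1, -0.2, 0.3, -0.2]  # made-up residuals
print(autocorrelations(residuals, 3))
print(ljung_box_q(residuals, 3))
```

Under the hypothesis of uncorrelated residuals, Q is compared with a chi-square distribution; for residuals of a fitted ARMA model the degrees of freedom are the number of lags minus the number of estimated AR and MA parameters. A Q value below the critical value corresponds to the non-significant result described in the text.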

Comparison of forecasting performance of different models
As shown in Table 5, the two models, namely the BLTES model and the ARIMA (0,1,1) model, are compared. The comparison focused on various measures of error. The forecasting performance of these two models is summarized in Table 5. The author interpreted and discussed the relevant issues according to the results illustrated in Table 5 and Figure 5.

Discussion
The results presented in Table 5 reveal that the U and predicted $R^2$ values of the ARIMA (0,1,1) model are 0.04145 and 0.96763, respectively, for the time series of the CRC in Libya. The U value is lower, and the predicted $R^2$ value is higher, than those of the other method. On that basis, the ARIMA (0,1,1) model achieved the best performance among all models because its fit was the best. The ACF and PACF of the residuals are presented in Figure 6. After fitting the model, the residuals should be only white noise if the forecasting model is good, so insignificant values are expected for these statistics. As shown in Figure 6, the ACF and PACF of the residual errors are insignificant, which means that the ARIMA (0,1,1) model is the most suitable for forecasting Libya's CRC.
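The text does not state which form of the U statistic was used. Assuming it is Theil's inequality coefficient (often written U1), which lies between 0 (perfect forecast) and 1, a minimal sketch is given below. The function name `theil_u1` and all data values are hypothetical and are not the CRC figures.

```python
import math


def theil_u1(actual, forecast):
    """Theil's inequality coefficient U1, bounded between 0 and 1."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    # Root mean squared forecast error
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    # Scale: root mean square of the actual values plus that of the forecasts
    scale = (math.sqrt(sum(a * a for a in actual) / n)
             + math.sqrt(sum(f * f for f in forecast) / n))
    return rmse / scale


actual = [100.0, 110.0, 120.0]   # made-up values
forecast = [98.0, 112.0, 119.0]  # made-up values
print(theil_u1(actual, forecast))
```

Under this definition, a lower U indicates forecasts that track the actual values more closely, which matches the way the U values are compared in the discussion.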
Using the ARIMA (0,1,1) model to forecast the CRC: this model can be used to forecast future values of the time series. Figure 7 presents the actual values for the period from the first week of June 2019 to the last week of August 2021 and the predicted values for the next 8 weeks (two months) using our ARIMA (0,1,1) model.

Figure 7. Forecast for the production of Libya's customs revenue collection
The selected model demonstrates excellent performance, as reflected in its explained variability and predictive power. The results of the model show an increase in the values of CRC in Libya during the next 8 weeks (two months). This result supports the findings in [2], which state that following the World Customs Organization's (WCO) instructions will mitigate the effects of the COVID-19 pandemic. The researchers in [16] have stated that epidemics are inherently unpredictable. This imposes limitations on the use of the ARIMA model, which belongs to a class of linear models: the ARIMA model cannot capture hidden nonlinear patterns in a time series. One of the main threats to the validity of our results is that other factors, such as the civil war and fluctuations in the exchange rate, affected CRC in only part of the time series under study.

CONCLUSION
The present research has proposed and evaluated methods of forecasting customs revenue collection (CRC) in light of the spread of the COVID-19 pandemic in Libya. The proposed models, that is, the exponential smoothing (SES and BLTES) and ARIMA models, were evaluated by comparing them with each other on the time series of CRC in Libya. This research makes a useful contribution to the literature because it represents the first empirical research that applied ARIMA models in this research area. The results provide evidence of the importance and value of ARIMA models as a powerful forecasting method that improves the accuracy of predictions of CRC and enhances forecasting methods in the Libyan context. Given the forecast increase in CRC, the author recommends that Libyan customs continue to apply the instructions of the World Customs Organization (WCO) to reduce the effects of the COVID-19 pandemic. Future research would benefit from the results of this research by focusing on other methods, by using data from a broader sample of CRC in the Libyan context, and by comparing the findings with the results of this research.
Figure 1: Time Series of CRC

Figure 3. Time series after the natural log transformation and first-order differencing.

Figure 5: Results of the comparison of the forecasting performance among the different models.

Table 1: Results of testing the significance of the estimated parameters of the EST models

As shown in Table 1, all parameters in the models are significant. The selected model also approximately fulfills the basic criteria of model selection, with minimum values of BIC and AIC, a high coefficient of determination and a non-significant Ljung-Box test.

Table 2: Comparative results from various exponential smoothing models for CRC.

Table 3: Results of testing the significance of the estimated parameters of the ARIMA models

Table 4: Comparative results from various ARIMA models for CRC.

Table 5: Statistical measures of forecast error for the CRC in Libya