An ARIMA model is an equation relating the present value, at time t, of an observed series to past values of the series. The principal parts of the model are terms specifying autoregressive (AR) and moving-average (MA) components, differencing of the time series, and an innovation series.
The model is defined in three parts: the orders, which give the number of lagged (past) values that appear in the model; the parameters, which are the Box-Cox parameter, the constant, the innovation variance and the coefficients associated with the AR and MA components; and the values for the lags, which by default are just the numbers 1, 2, ….
In Genstat, you specify an ARIMA model by giving the orders required. The parameters are then estimated from the data. In command mode you can also supply values for the parameters yourself, for example to forecast from a known model without estimating the parameters, and you can specify other values for the lags.
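To make the relationship between orders, parameters and the innovation series concrete, here is a minimal Python sketch (not Genstat code) that generates a series from a known ARIMA model with orders p=1, d=1, q=1. It assumes no constant and no Box-Cox transformation, and it assumes the Box-Jenkins sign convention in which the MA coefficient is subtracted. The function name simulate_arima_111 and the parameter values are hypothetical and chosen only for illustration.

```python
import random


def simulate_arima_111(n, phi, theta, sigma2, seed=0):
    """Simulate n values from an ARIMA(1,1,1) model (illustrative sketch).

    Assumptions: no constant, no Box-Cox transform, and the MA term is
    subtracted (Box-Jenkins convention).
    The differenced series w satisfies
        w[t] = phi * w[t-1] + a[t] - theta * a[t-1]
    where a is the innovation series, and the observed series y is the
    running sum of w (the inverse of first-order differencing).
    """
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    # Innovations: independent Normal deviates, mean 0, variance sigma2.
    a = [rng.gauss(0.0, sd) for _ in range(n)]

    # AR(1) and MA(1) terms acting on the differenced series.
    w = [a[0]]
    for t in range(1, n):
        w.append(phi * w[t - 1] + a[t] - theta * a[t - 1])

    # Undo the differencing by accumulating w.
    y = []
    total = 0.0
    for value in w:
        total += value
        y.append(total)
    return y, a


y, a = simulate_arima_111(n=200, phi=0.5, theta=0.3, sigma2=1.0)
```

Here the orders are (1, 1, 1), the parameters are phi, theta and sigma2, and the lags are the default value 1 for both the AR and MA terms.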
Seasonal models
To account for seasonal effects in the data, for example when there are regular cycles of a fixed length, you can extend the ARIMA model by adding another set of orders and specifying the seasonal period. For example, if data are collected hourly over several days and follow a daily pattern, you would set the seasonal period to 24.
In command mode you can specify more complex models using multiple seasonal periods with sets of orders for each period.
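As an illustration of why the seasonal period matters, the following Python sketch (not Genstat code) applies a seasonal difference, subtracting the value one full period earlier, to artificial hourly data with an exactly repeating daily pattern. The helper name seasonal_difference and the data are hypothetical; a seasonal ARIMA model may also include seasonal AR and MA terms, which are not shown here.

```python
import math


def seasonal_difference(series, period):
    """Return series[t] - series[t - period] for every t >= period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Four days of artificial hourly data with an exactly repeating daily cycle.
hourly = [10.0 + 3.0 * math.sin(2 * math.pi * (h % 24) / 24)
          for h in range(24 * 4)]

# With period 24, each value is compared with the same hour on the
# previous day, so the repeating daily pattern cancels out.
deseasonalised = seasonal_difference(hourly, period=24)
```

The result has 24 fewer values than the input, because the first day has no earlier day to be compared with.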
Notes
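- Autoregressive: an autoregressive process of order p is denoted AR(p). The present value is modelled as a linear combination of the p previous values of the series plus the current innovation.
- Moving-average: a moving-average process of order q is denoted MA(q). The present value is modelled as a linear combination of the current innovation and the q previous innovations.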
- Differencing: a method commonly used for filtering time series to remove trend and so obtain stationarity. The value at time t of the output series is obtained by taking the input value at time t and subtracting the input value at time t-1. Usually first-order differencing is sufficient, but occasionally higher-order differences are required; these are obtained by applying the difference operation again to the differenced series.
- Innovation series: an unobserved series forming part of the ARIMA model. It can be considered as the error in predicting the observed series at each time t. It is usually modelled as a series of independent Normal deviates with mean 0 and constant variance.
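The differencing operation described in the notes above can be sketched in a few lines of Python (not Genstat code). The helper name difference is hypothetical. The example shows first-order differencing removing a linear trend, and second-order differencing, obtained by differencing the differenced series, removing a quadratic one.

```python
def difference(series, order=1):
    """Apply the difference operation `order` times.

    Each pass replaces the series by x[t] - x[t-1], so the result is
    one value shorter per pass.
    """
    out = list(series)
    for _ in range(order):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


linear = [2 * t + 1 for t in range(6)]   # 1, 3, 5, 7, 9, 11
quadratic = [t * t for t in range(6)]    # 0, 1, 4, 9, 16, 25

first_linear = difference(linear)                 # [2, 2, 2, 2, 2]
first_quadratic = difference(quadratic)           # [1, 3, 5, 7, 9]
second_quadratic = difference(quadratic, order=2)  # [2, 2, 2, 2]
```

One difference is enough to make the linear series constant, whereas the quadratic series still shows a trend after one difference and needs a second.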