Hybrid Model GSTAR-SUR-NN For Precipitation Data

Spatio-temporal models that have been developed include the Space-Time Autoregressive (STAR) model, the Generalized Space-Time Autoregressive (GSTAR) model, GSTAR-OLS, and GSTAR-SUR. Besides spatio-temporal phenomena, data encountered in daily life often show nonlinear behavior, unusual patterns, and unidentified characteristics. One nonlinear model currently under development is the neural network. This study forms a hybrid GSTAR-SUR-NN model in order to obtain a spatio-temporal model with better predictions. The research uses ten-daily rainfall data from 2005 to 2015 for the Blimbing, Singosari, Karangploso, Dau, and Wagir regions. The results indicate that, at the 5% error level, the accuracy of the GSTAR ((1), 1,2,3,12,36)-SUR model with cross-covariance weights is relatively similar to that of GSTAR ((1), 1,2,3,12,36)-SUR-NN (25-14-5) for the Blimbing and Singosari regions. For Karangploso, Dau, and Wagir, the GSTAR ((1), 1,2,3,12,36)-SUR-NN (25-14-5) model predicts precipitation more accurately, with $R^2_{prediction}$ values of 0.992, 0.580, and 0.474, respectively.


INTRODUCTION
One of the spatio-temporal models that has been developed is the Space-Time Autoregressive (STAR) model, introduced by Pfeifer and Deutsch [1]. The STAR model does not fit data from locations with heterogeneous characteristics. This weakness is addressed by the Generalized Space-Time Autoregressive (GSTAR) model and GSTAR-OLS, developed by Ruchjana [2], [3]. The latest development among spatio-temporal models is GSTAR-SUR, developed by Iriany [4] to handle non-stationary data with seasonal patterns.
The choice of location weights in a spatio-temporal model also contributes to the accuracy of the model. The location weights most often used are uniform weights, inverse distance weights, and normalized cross-correlation weights [5], [6]. Location weights reflect the neighborhood relationship between locations. For data with high variability, it is necessary to use a location weight that accounts for the variability of the observations, namely the cross-covariance weight. Cross-covariance weights have been studied and applied by Apanasovich and Genton [7] to predict pollution in California and by Efromovich and Smirnova [8] to process fMRI images with a wavelet approach.
In addition to spatio-temporal phenomena, nonlinear phenomena are often encountered in daily life. The time series models described so far take a linear approach. Linear modeling has many limitations, especially the need to satisfy the assumptions underlying the linear model. Linear time series modeling is inappropriate and difficult to apply to data with a nonlinear pattern. Several nonlinear time series models have been developed and applied by Tong [9], Priestley [10], Lee et al. [11], and Granger and Terasvirta [12]. The nonlinear time series model that has been developed and applied most widely is the Artificial Neural Network. Therefore, this study forms a hybrid GSTAR-SUR-NN model in order to develop spatio-temporal models with better forecasts.

Location Weight
A problem that arises in space-time modeling is the choice of location weights. Several methods are available for determining the location weights in applications of space-time models [6], such as:

a. Uniform Weight
Uniform weights can be calculated using the formula $w_{ij} = 1/n_i$, where $n_i$ is the number of locations adjacent to location $i$.

b. Inverse Distance Weight
The inverse distance weight is calculated from the actual distance between locations. The closer two locations are, the greater the weight, so adjacent locations receive larger weights.

c. Normalized Cross-Correlation Weight
This weight is obtained by normalizing the cross-correlations between locations at the corresponding time lag [5]. The normalized cross-correlation weight was first introduced by Suhartono and Atok [6] and was applied further in [5].

d. Normalized Cross-Covariance Weight
Cross-covariance weights have been studied and applied by Apanasovich and Genton [7] to predict pollution in California, as well as by Efromovich and Smirnova [8] to process fMRI images with a wavelet approach.
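The first three weighting schemes can be illustrated with a minimal NumPy sketch. The function names are hypothetical. The cross-correlation normalization used here, $w_{ij} = r_{ij}(k) / \sum_{m \neq i} |r_{im}(k)|$, is the form commonly used in the GSTAR literature and is an assumption, since the section does not state the formula. Every location is assumed to have at least one neighbor.

```python
import numpy as np

def uniform_weights(adjacency):
    """w_ij = 1/n_i for each neighbor j of location i, and 0 otherwise."""
    a = np.array(adjacency, dtype=float)   # copy so the input is not modified
    np.fill_diagonal(a, 0.0)               # a location is not its own neighbor
    return a / a.sum(axis=1, keepdims=True)

def inverse_distance_weights(dist):
    """Row-normalized 1/d_ij, so closer locations receive larger weights."""
    d = np.array(dist, dtype=float)
    off_diag = ~np.eye(d.shape[0], dtype=bool)
    inv = np.zeros_like(d)
    inv[off_diag] = 1.0 / d[off_diag]
    return inv / inv.sum(axis=1, keepdims=True)

def normalized_cross_correlation_weights(z, lag=1):
    """z has shape (T, N). r_ij(lag) correlates z_i(t) with z_j(t - lag)."""
    zc = np.asarray(z, dtype=float)
    zc = zc - zc.mean(axis=0)
    T = zc.shape[0]
    cross = zc[lag:].T @ zc[:T - lag]      # entry (i, j): sum_t z_i(t) z_j(t - lag)
    scale = np.sqrt((zc ** 2).sum(axis=0))
    r = cross / np.outer(scale, scale)
    np.fill_diagonal(r, 0.0)
    return r / np.abs(r).sum(axis=1, keepdims=True)

# Three mutually adjacent locations on a line, one distance unit apart.
print(uniform_weights(np.ones((3, 3))))
print(inverse_distance_weights([[0, 1, 2], [1, 0, 1], [2, 1, 0]]))
```

Unlike the first two schemes, cross-correlation weights can be negative, which is why the normalization uses absolute values.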

GSTAR Model
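Ruchjana [3] described GSTAR as a generalization and extension of the Space-Time Autoregressive (STAR) model of Pfeifer [1]. The main differences lie in the spatial dependence and the weight matrix. GSTAR is more realistic because, in practice, models with different parameters for different locations are more common [13]. A GSTAR model with autoregressive order $p$ and spatial orders $\lambda_1, \lambda_2, \dots, \lambda_p$, written GSTAR ($p; \lambda_1, \lambda_2, \dots, \lambda_p$), is formulated as follows [14]:
$$Z(t) = \sum_{k=1}^{p} \left[ \Phi_{k0} + \sum_{l=1}^{\lambda_k} \Phi_{kl} W^{(l)} \right] Z(t-k) + e(t)$$
where $Z(t)$ is the vector of observations at the $N$ locations at time $t$, $\Phi_{k0}$ and $\Phi_{kl}$ are diagonal parameter matrices, $W^{(l)}$ is the weight matrix of spatial order $l$, and $e(t)$ is the error vector.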
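As an illustration of how a GSTAR model of spatial order 1 produces a forecast, the sketch below computes a one-step-ahead prediction as the sum, over the chosen time lags $k$, of $\Phi_{k0} Z(t-k) + \Phi_{k1} W Z(t-k)$. The function name, the dictionary layout of the parameters, and the numerical values are hypothetical; the diagonal parameter matrices are stored as vectors with one entry per location.

```python
import numpy as np

def gstar_forecast(history, lags, phi0, phi1, W):
    """One-step-ahead GSTAR forecast with spatial order 1.

    history : (T, N) array, most recent observation in the last row
    lags    : iterable of time lags, e.g. (1, 2, 3, 12, 36)
    phi0    : dict lag -> (N,) own-location parameters (diagonal of Phi_k0)
    phi1    : dict lag -> (N,) neighbor parameters (diagonal of Phi_k1)
    W       : (N, N) location weight matrix with zero diagonal
    """
    history = np.asarray(history, dtype=float)
    W = np.asarray(W, dtype=float)
    pred = np.zeros(history.shape[1])
    for k in lags:
        z_k = history[-k]                          # Z(t - k)
        pred += np.asarray(phi0[k]) * z_k          # own past values
        pred += np.asarray(phi1[k]) * (W @ z_k)    # weighted neighbor values
    return pred

# Toy example: three locations, uniform weights, time lags 1 and 2.
W = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
history = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
phi0 = {1: np.full(3, 0.5), 2: np.full(3, 0.1)}
phi1 = {1: np.full(3, 0.2), 2: np.zeros(3)}
print(gstar_forecast(history, (1, 2), phi0, phi1, W))
```

Because each location has its own entries in `phi0` and `phi1`, the parameters may differ between locations, which is the feature that distinguishes GSTAR from STAR.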

Seemingly Unrelated Regression (SUR)
Seemingly Unrelated Regression (SUR) is a system of equations whose parameters are estimated by Generalized Least Squares (GLS). Iriany (2013) explains that GLS is a regression coefficient estimator that takes into account the correlation between the errors of the equations. The errors are obtained from ordinary least squares (OLS) estimation and are then used to estimate the regression coefficients of the SUR system. A SUR model with $M$ equations is expressed as:
$$y_i = X_i \beta_i + \varepsilon_i, \quad i = 1, \dots, M \qquad (8)$$
where $y_i$ is a vector of size $R \times 1$, $X_i$ is a matrix of size $R \times K_i$, and $\beta_i$ is a vector of size $K_i \times 1$.
The GLS method uses the error variance $\mathrm{Cov}(\varepsilon) = E(\varepsilon \varepsilon') = \sigma^2 \Omega$. The matrix $\Omega$ describes the error correlation, with
$$\Omega = \Sigma \otimes I$$
where $I$ is the identity matrix of size $R \times R$ and $\Sigma$ is a matrix of size $M \times M$ whose elements $\sigma_{ij}$ are the error variance of each equation for $i = j$ and the error covariance between equations for $i \neq j$.
The model parameters are obtained by estimating $\beta$ in equation (8).
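The two-stage procedure described above (OLS residuals first, then GLS on the stacked system) can be sketched as a feasible GLS estimator. This is a minimal illustration under stated assumptions, not the estimation routine used in the study: the function name is hypothetical, $\Sigma$ is estimated as the residual cross-product divided by $R$, and the full Kronecker product is built explicitly, which is only practical for small systems.

```python
import numpy as np

def sur_fgls(ys, Xs):
    """Feasible GLS for a SUR system y_i = X_i beta_i + eps_i, i = 1..M.

    ys : list of M response vectors, each of length R
    Xs : list of M design matrices, each of shape (R, K_i)
    Returns the list of estimated beta_i and the estimated Sigma (M x M).
    """
    ys = [np.asarray(y, dtype=float) for y in ys]
    Xs = [np.asarray(X, dtype=float) for X in Xs]
    M, R = len(ys), len(ys[0])

    # Stage 1: OLS on each equation separately, keeping the residuals.
    resid = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0] for y, X in zip(ys, Xs)
    ])
    sigma = resid.T @ resid / R            # error (co)variances between equations

    # Stage 2: stack the equations into one block-diagonal system.
    ks = [X.shape[1] for X in Xs]
    X_big = np.zeros((M * R, sum(ks)))
    col = 0
    for i, X in enumerate(Xs):
        X_big[i * R:(i + 1) * R, col:col + ks[i]] = X
        col += ks[i]
    y_big = np.concatenate(ys)

    # GLS with Omega = Sigma (Kronecker product) I_R.
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(R))
    xtoi = X_big.T @ omega_inv
    beta = np.linalg.solve(xtoi @ X_big, xtoi @ y_big)
    return np.split(beta, np.cumsum(ks)[:-1]), sigma
```

When the errors of different equations are uncorrelated, or when every equation uses the same regressors, this estimator coincides with equation-by-equation OLS, so the gain from SUR comes from the off-diagonal elements of $\Sigma$.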

Hybrid Model GSTAR-SUR-NN
Hybrid models that integrate a neural network with a conventional forecasting model have been shown to produce more accurate forecasts. Examples of hybrid models using neural networks include the Feed Forward Neural Network for time series data [16], the Autoregressive Integrated Moving Average with Exogenous Factor-Neural Network (ARIMAX-NN) for inflation data in Indonesia [17], and the Neural Network-Multiscale Autoregressive (NN-MAR) model for forecasting the number of tourists [18].
Generalized Space-Time Autoregressive with Seemingly Unrelated Regression-Neural Network (GSTAR-SUR-NN) is a hybrid that integrates the GSTAR model with a neural network. GSTAR-SUR is used to determine the most suitable NN architecture so that the best forecasting performance can be obtained. Architecture here means the number of input variables used in the NN model.
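The idea of letting GSTAR-SUR fix the NN inputs can be sketched as follows. The sketch assumes that the 25 inputs of the (25-14-5) architecture are the values of the five locations at time lags 1, 2, 3, 12, and 36; this reading is an assumption, as are the function names, the tanh activation, and the random initialization. Only the input construction and the forward pass are shown, not the training procedure.

```python
import numpy as np

LAGS = (1, 2, 3, 12, 36)   # time lags taken from the GSTAR-SUR order

def lagged_inputs(z, lags=LAGS):
    """Build NN inputs and targets from a (T, N) series.

    Each input row holds z(t - k) for every lag k, concatenated in lag order;
    the matching target row is z(t).
    """
    z = np.asarray(z, dtype=float)
    T, max_lag = z.shape[0], max(lags)
    X = np.hstack([z[max_lag - k:T - k] for k in lags])
    y = z[max_lag:]
    return X, y

def init_mlp(n_in, n_hidden, n_out, rng):
    """Small random weights for a single-hidden-layer network."""
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(params, X):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = np.tanh(X @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]

# Synthetic stand-in for ten-daily rainfall at five locations.
rng = np.random.default_rng(0)
series = rng.gamma(shape=1.0, scale=50.0, size=(396, 5))
X, y = lagged_inputs(series)
net = init_mlp(n_in=X.shape[1], n_hidden=14, n_out=5, rng=rng)
print(X.shape, forward(net, X).shape)
```

With five locations and five lags the input matrix has 25 columns, matching the first number in (25-14-5); the hidden layer has 14 units and the output layer gives one forecast per location.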

RESULTS AND DISCUSSION
The descriptive statistics of the precipitation data for each location are given in Table 1. Table 1 shows that precipitation at every location has high variability: the standard deviation of precipitation at all locations is greater than the mean. High variability indicates that precipitation fluctuates to extreme values, especially during the rainy season. The result of the homogeneity test of variance for the precipitation data is shown in Figure 1.
Table 2 indicates that, among time lags 1 to 3, the lowest AIC occurs at the 3rd time lag. Thus, the order of the GSTAR model is GSTAR ((1), 1,2,3)-SUR. The ACF of the precipitation data at each location shows a seasonal pattern at time lags 12 and 36. Therefore, the appropriate model is GSTAR ((1), 1,2,3,12,36)-SUR. The architecture obtained for the GSTAR-SUR-NN model is (25-14-5).
The validation test of GSTAR ((1), 1,2,3,12,36)-SUR with cross-covariance weights in Table 3 shows that, at α = 5%, the model has a p-value of less than 0.05, which implies a significant difference between the mean of the actual precipitation data and the mean of the predictions. In other words, GSTAR ((1), 1,2,3,12,36)-SUR with cross-covariance weights still has low accuracy. In contrast, GSTAR ((1), 1,2,3,12,36)-SUR-NN (25-14-5) obtained a p-value of 0.741. A p-value greater than 0.05 implies no significant difference between the mean of the actual precipitation and the mean of the predicted precipitation. In other words, the GSTAR ((1), 1,2,3,12,36)-SUR-NN (25-14-5) model has high accuracy. This comparison shows that the GSTAR ((1), 1,2,3,12,36)-SUR-NN (25-14-5) model predicts the precipitation data more accurately.
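The two accuracy checks used in this comparison can be sketched as follows. The exact definitions used in the study are not stated, so both functions are assumptions: $R^2_{prediction}$ is taken here as $1 - SSE/SST$ on the validation data, and the comparison of means is taken to be a paired t-test. The function names and the numbers in the example are hypothetical.

```python
import numpy as np

def r2_prediction(actual, predicted):
    """1 - SSE/SST computed on the validation data (assumed definition)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = np.sum((actual - predicted) ** 2)
    sst = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - sse / sst

def paired_t_statistic(actual, predicted):
    """t statistic and degrees of freedom for H0: mean(actual - predicted) = 0."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    n = diff.size
    t_stat = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
    return t_stat, n - 1

# Hypothetical validation values for one location.
actual = np.array([1.0, 2.0, 3.0, 4.0])
predicted = np.array([1.1, 1.9, 3.2, 3.8])
print(r2_prediction(actual, predicted))
print(paired_t_statistic(actual, predicted))
```

The p-value follows from comparing the t statistic with Student's t distribution on the returned degrees of freedom. Failing to reject the null hypothesis only indicates that the means agree, so a measure such as $R^2_{prediction}$ is still needed to judge how closely individual predictions track the observations.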

Figure 1. The Result of Homogeneity Test of Variance

Table 1. Description of Precipitation Data in Five Locations

Table 2. The Value of AIC with VARMAX Procedure

Table 3. The Result of GSTAR-SUR and GSTAR-SUR-NN Model Validation Test

Table 4. The Comparison of GSTAR-SUR and GSTAR-SUR-NN Model Performance