Estimation of Geographically Weighted Regression Case Study on Wet Land Paddy Productivities in Tulungagung Regency

Regression is a method that relates independent variables to a dependent variable, with parameter estimates as its output. A principal problem with this method is its application to spatial data. The Geographically Weighted Regression (GWR) method is used to solve this problem. GWR is a regression technique that extends the traditional regression framework by allowing the estimation of local rather than global parameters. In other words, GWR runs a regression for each location instead of a single regression for the entire study area. The purpose of this research is to analyze the factors influencing wet land paddy productivity in Tulungagung Regency. The method used is GWR with a cross-validation bandwidth and an adaptive Gaussian kernel weighting function. This research uses four variables that are presumed to affect wet land paddy productivity: the rate of rainfall (X1), the average cost of fertilizer per hectare (X2), the average cost of pesticides per hectare (X3), and the allocation of subsidized NPK fertilizer for the food crops subsector (X4). The results show that X1, X2, X3 and X4 have different effects in each district. Improving wet land paddy productivity in Tulungagung Regency therefore requires district-specific policies based on the GWR model of each district.


INTRODUCTION
Conventional spatial analysis techniques use a single equation to assess the overall relationships between the dependent and independent variables across space, known as a global analytic approach. One important assumption underlying this approach is that the relationships of interest are spatially stationary, or homogeneous. While the global perspective is effective in handling spatial dependence and generates less biased estimates than non-spatial modeling, it is not capable of exploring spatial non-stationarity (or heterogeneity) or identifying place-specific associations [1].
Geographically weighted regression (GWR) is a local spatial statistical technique used to analyze spatial non-stationarity, defined as the situation in which the measured relationships among variables differ from location to location. Unlike conventional regression, which produces a single regression equation to summarize global relationships among the explanatory and dependent variables, GWR generates spatial data that express the spatial variation in the relationships among variables [2]. This approach incorporates locational information and smoothing techniques into regression models. In contrast to the global approach, GWR has proved to be a useful local spatial analysis tool that helps researchers add nuanced insights to the existing literature [1].
In this research, parameter estimation in the GWR method requires a weights matrix, which is calculated with an adaptive Gaussian kernel function. The data are weighted according to their proximity to the i-th observation location. Cross-validation is used to estimate the kernel bandwidth. The case study investigates wet land paddy productivity in Tulungagung Regency; 18 districts are analyzed, and 1 further district has no wet land paddy productivity. This research uses four variables that are presumed to affect wet land paddy productivity: the rate of rainfall (X1), the average cost of fertilizer per hectare (X2), the average cost of pesticides per hectare (X3), and the allocation of subsidized NPK fertilizer for the food crops sub-sector (X4). The final result of the GWR method is a productivity model of wet land paddy in Tulungagung Regency. In addition, the mapping of paddy productivity per district is expected to provide useful information, especially in agriculture, for increasing wet land paddy productivity in Tulungagung Regency.

METHODS
Geographically weighted regression is an extension of traditional multiple linear regression toward a local regression, in which the regression coefficients are specific to a location rather than being global estimates. The specification of a basic GWR model is:

$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$

where $y_i$ is the dependent variable at location $i$ with coordinates $(u_i, v_i)$, $x_{ik}$ is the value of the $k$th explanatory variable at location $i$ (of $p$ explanatory variables), $\beta_k(u_i, v_i)$ is the local regression coefficient for the $k$th explanatory variable at location $i$, $\beta_0(u_i, v_i)$ is the intercept parameter at location $i$, and $\varepsilon_i$ is the random disturbance at location $i$, which may follow an independent normal distribution with zero mean and homogeneous variance [3].
To facilitate the exposition, it is convenient to express the GWR model in matrix notation. Weighted Least Squares (WLS) is used for GWR parameter estimation, so for each $i$-th point the parameter estimates are obtained by the matrix operation:

$$\hat{\beta}(u_i, v_i) = \left(X^T W(u_i, v_i)\, X\right)^{-1} X^T W(u_i, v_i)\, y$$

with

$$W(u_i, v_i) = \mathrm{diag}\left(w_1(u_i, v_i), w_2(u_i, v_i), \ldots, w_n(u_i, v_i)\right)$$

where $X$ is the $n \times (p+1)$ matrix of explanatory variables (including a column of ones for the intercept) and $y$ is the vector of $n$ observations of the dependent variable.

The first step in estimating the GWR parameters is to decide on the spatial weighting matrix, which can be calculated by different methods. One method is to specify $w_{ij}$ as a continuous and monotonically decreasing function of the distance $d_{ij}$ between points $i$ and $j$. For an adaptive kernel size, the weight of each point can be calculated by applying the Gaussian function [3]:

$$w_j(u_i, v_i) = \exp\left(-\frac{1}{2}\left(\frac{d_{ij}}{h_{i(q)}}\right)^2\right)$$

where $w_j(u_i, v_i)$ is the weight of location $j$ in the space at which data are observed for estimating the dependent variable at location $i$, and $h_{i(q)}$ is referred to as the bandwidth. $d_{ij}$ is the Euclidean distance between points $i$ and $j$, $d_{ij} = \sqrt{(u_i - u_j)^2 + (v_i - v_j)^2}$. The bandwidth specifies how the extent of the kernel is determined; it controls the degree of smoothing in the model. Cross-validation (CV) is an iterative process that searches for the kernel bandwidth that minimizes the prediction error of all the y(s) using a subset of the data for prediction [1]:

$$CV(h) = \sum_{i=1}^{n} \left(y_i - \hat{y}_{\neq i}(h)\right)^2$$

where $\hat{y}_{\neq i}(h)$ is the predicted value of observation $i$ with calibration location $i$ left out of the estimation dataset.
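The estimation and bandwidth-selection steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel is taken in the common exp(-½(d/h)²) form, the adaptive bandwidth at location i is assumed to be the distance to its q-th nearest neighbour, and all function names and data values are hypothetical. Only the standard library is used so that the sketch runs anywhere.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def adaptive_gaussian_weights(coords, i, q):
    """Weights w_j(u_i, v_i); the bandwidth is assumed to be the distance
    from location i to its q-th nearest neighbour (coordinates distinct)."""
    d = [math.dist(coords[i], c) for c in coords]
    h = sorted(d)[q]  # index 0 is location i itself (distance 0)
    return [math.exp(-0.5 * (dij / h) ** 2) for dij in d]

def local_wls(X, y, w):
    """Local estimate (X^T W X)^{-1} X^T W y with W = diag(w)."""
    n, p = len(X), len(X[0])
    XtWX = [[sum(w[r] * X[r][a] * X[r][b] for r in range(n)) for b in range(p)]
            for a in range(p)]
    XtWy = [sum(w[r] * X[r][a] * y[r] for r in range(n)) for a in range(p)]
    return solve(XtWX, XtWy)

def cv_score(coords, X, y, q):
    """Leave-one-out CV: sum of squared errors when each y_i is predicted
    from a local fit that gives observation i zero weight."""
    total = 0.0
    for i in range(len(y)):
        w = adaptive_gaussian_weights(coords, i, q)
        w[i] = 0.0  # leave observation i out of its own calibration
        beta = local_wls(X, y, w)
        y_hat = sum(b * x for b, x in zip(beta, X[i]))
        total += (y[i] - y_hat) ** 2
    return total

# Made-up data: six locations, one explanatory variable plus an intercept.
coords = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 7.0]
X = [[1.0, xi] for xi in x]
y = [2.1, 7.9, 14.2, 11.0, 16.8, 23.1]

# Pick the neighbour count q with the smallest CV score, then fit location 0.
best_q = min(range(2, 6), key=lambda q: cv_score(coords, X, y, q))
beta_0 = local_wls(X, y, adaptive_gaussian_weights(coords, 0, best_q))
```

Searching over a neighbour count q rather than a fixed distance is what makes the kernel adaptive: locations in sparse areas get a wider kernel than locations in dense areas.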
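The second step is testing for spatial heterogeneity. Spatial heterogeneity indicates variation between locations, so each location has its own relationship structure and parameters. Spatial heterogeneity in the data can be tested using the Breusch-Pagan (BP) test [4], with the hypotheses:

H0: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_n^2 = \sigma^2$ (no spatial heterogeneity)
H1: at least one $\sigma_i^2 \neq \sigma^2$ (spatial heterogeneity)

The test statistic is:

$$BP = \frac{1}{2} f^T Z \left(Z^T Z\right)^{-1} Z^T f + \frac{1}{T}\left(\frac{e^T W e}{\sigma^2}\right)^2$$

where $f = (f_1, f_2, \ldots, f_n)^T$ with $f_i = e_i^2/\sigma^2 - 1$, $e$ is the vector of OLS residuals, $\sigma^2$ is the variance based on the OLS residuals, $W$ is the spatial weighting matrix, $T = \mathrm{tr}\left[W^T W + W^2\right]$, and $Z$ is an $n \times (p+1)$ matrix of standardized scores (z). H0 is rejected when BP exceeds $\chi^2(\alpha; p+1)$.

The local parameters are tested partially with the hypotheses H0: $\beta_k(u_i, v_i) = 0$ against H1: $\beta_k(u_i, v_i) \neq 0$, using the t statistic:

$$t = \frac{\hat{\beta}_k(u_i, v_i)}{\hat{\sigma}\sqrt{c_{kk}}}$$

where $c_{kk}$ is the $k$th diagonal element of the matrix $CC^T$, with $C = \left(X^T W(u_i, v_i)\, X\right)^{-1} X^T W(u_i, v_i)$.

The parameters are also tested simultaneously, with the hypotheses:

H0: $\beta_k(u_i, v_i) = \beta_k$ for every $k$ and $i$
H1: at least one $\beta_k(u_i, v_i)$ has a relation with location $(u_i, v_i)$

The test statistic is an F statistic whose terms involve the matrix $(I - L)$, where $L$ is the matrix that maps the observations to the GWR fitted values, $\hat{y} = Ly$.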
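As a rough illustration of the BP test mentioned above, the following sketch computes a Breusch-Pagan statistic with an added spatial term, in the form usually attributed to Anselin. That this is the exact variant used in the paper is an assumption, suggested by the quantities e, σ², T and Z that the paper defines; the function names and the small inputs are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def spatial_bp(e, Z, W):
    """BP = 1/2 f'Z(Z'Z)^{-1}Z'f + (1/T)(e'We / sigma^2)^2  (assumed form).

    e: OLS residuals, Z: n x (p+1) matrix of standardized scores,
    W: n x n spatial weighting matrix.
    """
    n, k = len(e), len(Z[0])
    sigma2 = sum(ei * ei for ei in e) / n           # residual variance e'e / n
    f = [ei * ei / sigma2 - 1.0 for ei in e]        # f_i = e_i^2 / sigma^2 - 1
    ZtZ = [[sum(Z[r][a] * Z[r][b] for r in range(n)) for b in range(k)]
           for a in range(k)]
    Ztf = [sum(Z[r][a] * f[r] for r in range(n)) for a in range(k)]
    gamma = solve(ZtZ, Ztf)                         # (Z'Z)^{-1} Z'f
    hetero = 0.5 * sum(g * z for g, z in zip(gamma, Ztf))
    eWe = sum(e[i] * W[i][j] * e[j] for i in range(n) for j in range(n))
    # T = tr(W'W + W^2) = sum_ij (W_ij^2 + W_ij * W_ji)
    T = sum(W[i][j] ** 2 + W[i][j] * W[j][i] for i in range(n) for j in range(n))
    return hetero + (eWe / sigma2) ** 2 / T

# Made-up inputs: four locations on a line, neighbours share a border.
e = [1.0, -1.0, 1.0, -1.0]
Z = [[1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0]]
W = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
bp = spatial_bp(e, Z, W)
```

The resulting value would then be compared with the chi-square critical value at the chosen significance level, as done in the results section.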

RESULTS AND DISCUSSION
The Breusch-Pagan (BP) test is first applied to wet land paddy productivity in Tulungagung Regency.

Danang Ariyanto 12
Because the BP test statistic is greater than $\chi^2(0.05; 5) = 11.070$, the decision is to reject $H_0$. There is spatial heterogeneity in wet land paddy productivity in Tulungagung Regency, so geographically weighted regression can be used to estimate the parameters of the wet land paddy productivity model for Tulungagung Regency.
The next step is to determine the parameter estimates of the GWR model based on equation 7. Table 1 shows the parameter estimates of the GWR model. The GWR model for the district of Besuki, based on Table 1, can be written as follows:

$$\hat{y}_1 = 26.553 - 0.0217x_1 + 0.0000165x_2 + 0.0000363x_3 - 0.00150x_4$$

The GWR models for the other districts are obtained in the same way as in equation 8, based on Table 1.
The parameters of the GWR model are tested simultaneously to determine the effect of weighting in the parameter estimation process for paddy productivity in Tulungagung Regency. Based on equation 12, the simultaneous test gives F = 8.904, which is greater than F(0.05; 10, 3) = 8.785. This shows that, simultaneously, the predictor variables X1, X2, X3 and X4 have a spatially significant effect on the response variable.
The GWR model of each district is formed on the basis of the influential parameters, so partial parameter testing is performed using the t test (Table 2). Based on Table 2, farmers in every district need to increase the average cost of pesticides per hectare (X3), because its effect on wet land paddy productivity is positive and significant. In the districts of Ngantru, Karangrejo and Gondang, the average cost of fertilizer per hectare (X2) needs to be increased, because it has a significant positive effect on paddy productivity. The increases in X2 and X3 are not meant to be continuous. There are 12 districts that need a larger allocation of subsidized NPK fertilizer for the food crops sub-sector (X4), because it proved to have a significant positive effect on wet land paddy productivity. The districts are grouped according to the variables that significantly affect the GWR model; the result is listed in Table 3.
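To make the fitted local model concrete, the sketch below evaluates the Besuki equation reported above and repeats the F-test comparison. The coefficients and the F values are taken from the text, reading its decimal commas as decimal points; the input values for X1 to X4 are purely hypothetical, since the paper's data are not reproduced here, and the function names are illustrative.

```python
# Local coefficients for Besuki as reported in the text: intercept, X1..X4.
BESUKI_COEFFS = (26.553, -0.0217, 0.0000165, 0.0000363, -0.00150)

def predict_productivity(coeffs, x1, x2, x3, x4):
    """Evaluate y_hat = b0 + b1*x1 + b2*x2 + b3*x3 + b4*x4 for one district."""
    b0, b1, b2, b3, b4 = coeffs
    return b0 + b1 * x1 + b2 * x2 + b3 * x3 + b4 * x4

def rejects_h0(statistic, critical_value):
    """Decision rule used in the text: reject H0 when the statistic
    exceeds the critical value."""
    return statistic > critical_value

# Hypothetical inputs (not from the paper): rainfall, fertilizer cost per
# hectare, pesticide cost per hectare, subsidized NPK allocation.
example = predict_productivity(
    BESUKI_COEFFS, x1=100.0, x2=1_000_000.0, x3=500_000.0, x4=1_000.0
)

# Simultaneous test as reported: F = 8.904 against F(0.05; 10, 3) = 8.785.
f_significant = rejects_h0(8.904, 8.785)
```

Because each district has its own coefficient tuple, the same function can be reused for the other districts by swapping in their rows of Table 1.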
Based on Table 3, the spread pattern of the variables that have a significant effect on paddy productivity, by district, in Tulungagung Regency is presented in Figure 1. The yellow group is the group in which all four variables, X1, X2, X3 and X4, are significant for wet land paddy productivity; it consists of 9 districts. In the red group, only X2, X3 and X4 have a significant effect; there are 3 districts in this group. The purple group has two significant variables, X2 and X3, and contains 3 districts. The last group is the green group, where only X3 has a significant effect; it consists of 3 districts. The district of Tanggunggunung has no wet land paddy productivity, so it is coloured white.
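The grouping described above amounts to collecting, for each district, the set of variables whose t statistic is significant, and then putting districts with the same set together. A minimal sketch follows; the district labels, t values and critical value are placeholders and are not the figures of Table 2.

```python
def significant_variables(t_stats, t_crit):
    """Names of the variables whose |t| exceeds the critical value."""
    return tuple(sorted(name for name, t in t_stats.items() if abs(t) > t_crit))

def group_districts(t_table, t_crit):
    """Map each set of significant variables to the districts that share it."""
    groups = {}
    for district, t_stats in t_table.items():
        key = significant_variables(t_stats, t_crit)
        groups.setdefault(key, []).append(district)
    return groups

# Placeholder t statistics (hypothetical, for illustration only).
T_TABLE = {
    "district_A": {"X1": 2.9, "X2": 3.1, "X3": 4.0, "X4": 2.6},
    "district_B": {"X1": 0.8, "X2": 2.7, "X3": 3.5, "X4": 2.4},
    "district_C": {"X1": 0.5, "X2": 2.5, "X3": 3.2, "X4": 1.1},
    "district_D": {"X1": 0.3, "X2": 1.0, "X3": 2.8, "X4": 0.9},
}

# Assumed critical value; the real one depends on the test's degrees of freedom.
groups = group_districts(T_TABLE, t_crit=2.0)
```

With these placeholders, each of the four districts falls into a different group, mirroring the four colour groups of the map.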

Figure 1. The spread pattern of variables that significantly affect wet land paddy productivity, by district, in Tulungagung Regency.

In the BP test statistic, $e$ is the vector of OLS residuals, $\sigma^2$ is the variance based on the OLS residuals, $T = \mathrm{tr}\left[W^T W + W^2\right]$, and $Z$ is an $n \times (p+1)$ matrix of standardized scores (z), where $n$ is the number of observations and $p$ is the number of explanatory variables.

Table 1. Parameter estimates of the GWR model

Table 2. t test statistics for the parameters of the GWR model

Table 3. Grouping of districts according to the significant variables of the GWR model