Spatial Analysis of Dengue Disease in Jakarta Province



INTRODUCTION
The dengue virus, which causes dengue disease, is transmitted primarily through the bite of the female Aedes aegypti mosquito. These mosquitoes can breed in slum areas where there are neglected pools of water and dark, damp places [1]. Dengue disease remains one of the main health problems threatening the Indonesian people because Indonesia is a tropical country that is vulnerable to vector-borne diseases, that is, diseases whose likelihood and risk of occurrence increase with changes in the weather cycle [2].
Dengue virus is highly sensitive to changes in average temperature, humidity, and rainfall, which can affect the life cycle and reproduction of the Aedes aegypti mosquito that carries the virus [2], [3]. Indonesia has two seasons, the dry season and the rainy season. Dengue disease increases during the rainy season because the weather becomes humid and standing water often forms where water channels do not flow or where flooded areas are not cleaned up immediately.
According to [1], slum areas are more likely to become breeding grounds for Aedes aegypti mosquitoes, so a program to eradicate mosquito nests in slum areas is needed to reduce dengue disease cases. Meanwhile, according to [4], areas that often flood because of high rainfall are vulnerable in the health sector, specifically to dengue disease, as a result of climate change. It can therefore be concluded that flooding has an indirect effect on dengue disease through climate change.
Several studies show that dengue disease is related to population mobility, population density, and people's living behavior. The studies in [5], [6] show that dengue disease cases rise as population density increases, which indicates that dengue disease spreads more easily in densely populated areas. In Indonesia, the mosquitoes live in warm tropical areas below 1,000 meters above sea level [7]. Jakarta, in particular, is located in a region with ideal conditions for mosquito breeding.
In 2019, Jakarta was the province with the highest population density in Indonesia, at 15,328 people per km² [8]. Jakarta Province also has the highest percentage of urban slum households (among the lowest 40 percent of the population), at 42.73 percent. Urban slum households are defined as households that do not have access to safe drinking water sources; do not have access to proper sanitation; do not have a floor area of at least 7.2 m² per capita; and do not have proper roof, floor, and wall conditions [9].
Jakarta Province is vulnerable to the transmission of dengue disease because of its high population density, its percentage of urban slum households, characteristics that make some areas prone to flooding when rainfall is high, and temperatures that are optimal for the breeding of the Aedes aegypti mosquito. This is evident from the 2019 dengue disease morbidity rate of 82.45 per 100,000 population, which placed Jakarta Province among the top 10 provinces in Indonesia [10].
Research on the spread of dengue disease with a spatial approach was carried out by [11] using Moran's I and the Local Indicators of Spatial Association (LISA). That study shows that six variables (dengue disease cases, population, population density, temperature, rainfall, and wind speed) have positive spatial autocorrelation between villages in Padang municipality. Furthermore, [12], using Moran's I and Geary's C, shows that dengue disease transmission in Semarang municipality exhibits spatial autocorrelation. Both studies reach the same conclusion: there is positive spatial autocorrelation in the number of dengue disease cases.
Another study was conducted by [5] using the Spatial Autoregressive (SAR) model. Its findings show that the factors that significantly influence the number of dengue disease cases in Central Java Province are the number of protected spring facilities, the percentage of the population with access to sustainable drinking water, population density, the number of village polyclinics per 1,000 population, the number of public health centers per 1,000 population, and the percentage of clean water free of bacteria, fungi, and chemicals. In addition, [13] compared the Spatial Durbin Model (SDM) and the SAR model and found that SAR performed better than SDM in predicting the factors that influence the transmission of dengue disease in Central Java Province. In general, the number of residents and the average length of schooling are the factors that affect the spread of dengue disease in Central Java Province. The two studies have one thing in common: the unit of observation and analysis is the regency and municipality in Central Java Province. This study looks at how various environmental and social issues in Jakarta influence the

Data Analysis Steps
Data processing was carried out using R software version 4.1.2, and thematic maps were created using QGIS Desktop 3.16.15. The steps of data analysis were as follows:

1. Exploring the data for all variables using thematic maps so that the pattern of distribution between sub-districts can be identified.

2. Testing the classical assumptions of the multiple linear regression model before modeling with spatial regression [16].

3. Creating a spatial weight matrix, which is required before performing Moran's I test. The most common way to represent spatial relationships is through the concept of contiguity: areas are considered related if their boundaries share common points. Under Queen contiguity, every region that touches the boundary of another region, either along a side or at a corner, is considered a neighbor [17]. Queen contiguity is the spatial weight matrix used in this study.

4. Checking whether there is autocorrelation between sub-districts by conducting Moran's I test (Moran index) [18]. The hypotheses of Moran's I test are as follows:

   $H_0$: No spatial autocorrelation under the given $W$ (spatial weight matrix)
   $H_1$: There is spatial autocorrelation under the given $W$

   Moran's I is defined as:

   $$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

   where $n$ denotes the number of observations, $\bar{x}$ denotes the average value of $x_i$ over the $n$ locations, $x_i$ denotes the value at location $i$, $x_j$ denotes the value at location $j$, and $w_{ij}$ denotes the element of the spatial weight matrix. The Moran's I test statistic is defined as:

   $$Z(I) = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}$$

   where $Z(I)$ denotes the Moran's I test statistic value, $E(I) = -1/(n-1)$ denotes the expected value of Moran's I under $H_0$, and $\mathrm{Var}(I)$ denotes the variance of Moran's I. The decision criterion is to reject $H_0$ if $|Z(I)| > Z_{\alpha/2}$.

5. Checking whether there is spatial dependence in the lag or in the error by using the Lagrange Multiplier (LM) test, which is used to determine the type of spatial model that is appropriate [19], [20]. The general form of the SAR model is defined as follows [20]-[22]:

   $$y = \rho W y + X\beta + \varepsilon; \quad \varepsilon \sim N(0, \sigma^2 I_n) \qquad (8)$$

   where $y$ denotes the response variable, $X$ denotes the predictor variables, $\rho$ denotes the spatial autocorrelation coefficient on the response variable, $W$ denotes the spatial weight matrix, $\beta$ denotes the intercept and regression coefficients, and $\varepsilon$ denotes the error. The hypotheses for the spatial dependence on lag are as follows:

   $H_0: \rho = 0$ (no spatial dependence on lag)
   $H_1: \rho \neq 0$ (lag has a spatial dependence)

   With $LM_{lag}$ denoting the test statistic for the spatial dependence on lag, the decision criterion is to reject $H_0$ if $LM_{lag} > \chi^2_{(1-\alpha);\,df=1}$. If $H_0$ is rejected, the SAR model is used. The spatial dependence on error is tested in the same way with the statistic $LM_{error}$; if its $H_0$ is rejected, the Spatial Error Model (SEM) is used. If both $LM_{lag}$ and $LM_{error}$ are significant, the best model is chosen by comparing the Akaike Information Criterion (AIC) values. The best model is the one with the lowest AIC value [16]. The AIC calculated using the Maximum Likelihood Estimation (MLE) method is as follows [23]:

   $$AIC = -2 L_m + 2k$$

   where $L_m$ denotes the maximum log-likelihood and $k$ denotes the number of model parameters.

6. Estimating the parameters of the SAR model, a model whose dependent variable is spatially correlated. Parameter estimation using the Maximum Likelihood method is defined as follows [24], [25]:

   $$\hat{\beta} = (X^{\top} X)^{-1} X^{\top} (I - \rho W) y \qquad (12)$$

   where $\hat{\beta}$ is the regression parameter estimator based on the weight matrix $W$ and the spatial autocorrelation coefficient $\rho$. Equation (12), however, cannot be solved directly because the value of $\rho$ is unknown. As a result, the concentrated log-likelihood function $\ln L_C$ is employed, as defined below [18]:

   $$\ln L_C(\rho) = C - \frac{n}{2} \ln\left[ \frac{1}{n} (e_0 - \rho e_L)^{\top} (e_0 - \rho e_L) \right] + \ln\left| I - \rho W \right| \qquad (13)$$

   where $C$ is a constant, and $e_0$ and $e_L$ denote the least-squares residuals from regressing $y$ and $Wy$ on $X$, respectively. Equation (13) is a nonlinear function of the single parameter $\rho$ and is maximized using a numerical technique with direct search.

7. Interpreting the obtained SAR model, including the direct impact of covariates.

8. Performing diagnostic tests of the SAR model.
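The direct search in step 6 can be illustrated with a minimal sketch. It assumes the standard concentrated log-likelihood for the SAR model, in which the residuals of $y$ and $Wy$ regressed on $X$ (called `e0` and `eL` here) are combined for each candidate $\rho$. The function names, the grid of candidate values, and the toy data are all hypothetical; the study itself used R, so this pure-Python version only shows the idea and is not the authors' code.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def log_abs_det(A):
    """Return log|det(A)| using the pivots of Gaussian elimination."""
    n = len(A)
    M = [row[:] for row in A]
    total = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        total += math.log(abs(M[c][c]))
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return total

def ols_coef(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    cols = list(zip(*X))
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    Xty = [sum(a * b for a, b in zip(ci, y)) for ci in cols]
    return solve(XtX, Xty)

def residuals(X, y):
    b = ols_coef(X, y)
    return [yi - sum(xj * bj for xj, bj in zip(row, b)) for row, yi in zip(X, y)]

def concentrated_loglik(rho, W, e0, eL):
    """Concentrated log-likelihood of rho, up to the additive constant C."""
    n = len(e0)
    sse = sum((a - rho * b) ** 2 for a, b in zip(e0, eL))
    A = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)] for i in range(n)]
    return -(n / 2) * math.log(sse / n) + log_abs_det(A)

def fit_sar(X, y, W, steps=199):
    """Grid search over rho in [-0.99, 0.99], then beta from (I - rho W) y."""
    Wy = [sum(w * v for w, v in zip(row, y)) for row in W]
    e0, eL = residuals(X, y), residuals(X, Wy)
    grid = [-0.99 + 1.98 * k / (steps - 1) for k in range(steps)]
    rho = max(grid, key=lambda r: concentrated_loglik(r, W, e0, eL))
    y_star = [yi - rho * wyi for yi, wyi in zip(y, Wy)]
    return rho, ols_coef(X, y_star)

# Toy data (hypothetical): four regions on a line, row-standardized weights.
W = [[0, 1, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 1, 0]]
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 3.0]]  # intercept + one predictor
y = [2.0, 3.5, 6.1, 4.0]
rho_hat, beta_hat = fit_sar(X, y, W)
print("rho:", rho_hat, "beta:", beta_hat)
```

A plain grid is the simplest form of direct search; a production routine would typically refine the maximum with a one-dimensional optimizer and use sparse matrix methods for the log-determinant.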

Descriptive Analysis
Figure 1 depicts the distribution of the number of dengue disease cases in Jakarta Province. In Figure 1, the sub-districts with a high number of dengue disease cases are shown in dark red; most of them are located in the municipalities of East Jakarta and North Jakarta, and a small portion in the municipalities of West Jakarta and South Jakarta. The sub-districts with a moderate number of cases are shown in light red; most of them are in the municipalities of West Jakarta and South Jakarta, and a small portion in the municipalities of East Jakarta and North Jakarta. The sub-districts with a low number of cases are shown in white; most of them are in the municipality of Central Jakarta, and a small portion in the municipalities of South Jakarta and North Jakarta. This indicates that sub-districts with a high number of dengue disease cases are likely to be located near sub-districts with a moderate number of cases.
Figure 2 shows that Cilincing, Koja, Tanjung Priok, Cengkareng, and Pulo Gadung are the sub-districts with the highest number of flood-prone points. The number of dengue disease cases in these sub-districts is also high. In addition, Pademangan, Taman Sari, Gambir, Senen, Menteng, Johar Baru, and Cilandak are the sub-districts with the fewest flood-prone points, and the number of dengue disease cases in these sub-districts is comparatively low. This indicates a positive correlation between the number of flood-prone points and the number of dengue disease cases in Jakarta Province.
Figure 2 also shows that Cilincing, Koja, Cengkareng, and Jatinegara are the sub-districts with the highest number of slum neighborhood associations. The number of dengue disease cases in these sub-districts is also high. Cempaka Putih and Pancoran have the fewest slum neighborhood associations, and the number of dengue disease cases in these sub-districts is comparatively low. This indicates a substantial positive correlation between the number of slum neighborhood associations and the number of dengue disease cases in Jakarta Province.
Figure 2 shows that Koja, Kramat Jati, and Jatinegara are sub-districts with a high population density, and these sub-districts also have a high number of dengue disease cases. Penjaringan and Cilandak are the sub-districts with the lowest population density, and both also have the lowest number of dengue disease cases. This indicates a positive correlation between population density and the number of dengue disease cases in Jakarta Province.
Figure 2 also shows that Cilincing, Cakung, Cengkareng, and Cipayung are sub-districts with a low number of hospitals per 1,000 population. These sub-districts nevertheless have a high number of dengue disease cases, indicating a negative relationship between the number of hospitals per 1,000 population and the number of dengue disease cases in Jakarta Province.

Figure 2. Map of the Distribution of the Predictor Variables
Figure 2 further shows that the sub-districts with the lowest number of public health centers per 1,000 population are Cakung, Koja, Cengkareng, and Ciracas. These sub-districts nevertheless have a high number of dengue disease cases. This indicates a negative correlation between the number of public health centers per 1,000 population and the number of dengue disease cases in Jakarta Province.
Based on Figures 1 and 2, we can conclude descriptively that there is a relationship between the number of dengue disease cases and all predictor variables used in this study. To confirm this relationship, however, a spatial regression analysis must be performed.

Spatial Analysis
Table 2 shows the results of the classical assumption tests for the linear regression model. The p-values of the normality test and the homoscedasticity test are greater than 0.05, implying that the normality and homoscedasticity assumptions are fulfilled. Table 2 also shows that the Durbin-Watson ($d$) value lies in the range $d_U < d < 4 - d_U$, so the non-autocorrelation assumption is fulfilled, and that the VIF values are lower than 5 for all variables, so the non-multicollinearity assumption is fulfilled. Even though the linear regression model fulfills all of its assumptions, it is still necessary to investigate whether there is autocorrelation between sub-districts by performing Moran's I test. This test requires a spatial weight matrix; this study uses the contiguity spatial weight matrix of the Queen type. Table 3 shows the results of the spatial autocorrelation test using Moran's I.

According to Table 3, there is spatial autocorrelation in the number of dengue disease cases because the p-value is less than 0.05. Two of the five predictor variables also show spatial autocorrelation. The Moran's I values all lie between 0 and 1, indicating positive spatial autocorrelation: the closer two areas are, the more similar their values tend to be.

The Likelihood Ratio (LR) test is used to determine which model performs better, spatial regression or linear regression. The Lagrange Multiplier (LM) test is used to determine whether the spatial dependence lies in the dependent variable (lag), in unobserved variables (error), or in both. Table 4 shows the output of the LR and LM tests.
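To make the Moran's I test concrete, the following minimal sketch builds a Queen contiguity matrix and computes Moran's I with a z-score. Several things here are assumptions and not taken from the study: the regions are laid out as cells of a regular grid (real sub-district boundaries would come from a map file), the weights are binary, the variance formula is the one derived under the normality assumption (the R routine used in the study may use a different variance), and the case counts are made up.

```python
import math

def queen_weights(nrows, ncols):
    """Binary Queen contiguity for grid cells: neighbors share a side or a corner."""
    n = nrows * ncols
    W = [[0.0] * n for _ in range(n)]
    for r in range(nrows):
        for c in range(ncols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < nrows and 0 <= cc < ncols:
                        W[r * ncols + c][rr * ncols + cc] = 1.0
    return W

def morans_i(x, W):
    """Moran's I: n * sum_ij w_ij d_i d_j / (S0 * sum_i d_i^2), d = x - mean."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s0 = sum(sum(row) for row in W)
    num = sum(W[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return n * num / (s0 * den)

def moran_z_normal(x, W):
    """Z-score of Moran's I using the variance under the normality assumption."""
    n = len(x)
    s0 = sum(sum(row) for row in W)
    s1 = 0.5 * sum((W[i][j] + W[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum((sum(W[i]) + sum(W[j][i] for j in range(n))) ** 2 for i in range(n))
    e_i = -1.0 / (n - 1)
    var = (n * n * s1 - n * s2 + 3 * s0 * s0) / ((n * n - 1) * s0 * s0) - e_i ** 2
    return (morans_i(x, W) - e_i) / math.sqrt(var)

# Hypothetical case counts on a 3 x 3 grid, with high values clustered together.
W = queen_weights(3, 3)
cases = [90, 80, 30, 85, 70, 25, 40, 20, 10]
print("Moran's I:", morans_i(cases, W))
print("z-score:", moran_z_normal(cases, W))
```

A value of I above its expectation $-1/(n-1)$ with a large positive z-score corresponds to the clustering of similar values that the text describes as positive spatial autocorrelation.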
According to the results in Table 4, the LR test is significant because the p-value is less than 0.05, indicating that there is a significant difference between the spatial regression model and the linear regression model. The AIC of the linear regression model is also the largest when compared with the two spatial regression models. Both the spatial dependence in the lag and the spatial dependence in the error are significant, as indicated by p-values less than 0.05. This study uses the SAR model because its AIC value is lower than that of the SEM. Table 5 shows the estimation results of the SAR model parameters. According to Table 5, the spatial lag variable (Rho) has a statistically significant and positive coefficient, indicating that as the number of dengue disease cases in one sub-district increases, so does the number of cases in neighboring sub-districts. The Wald test p-value is less than 0.05, indicating a significant joint relationship between the number of dengue disease cases in Jakarta Province and all predictor variables.
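The model comparison described above can be sketched in a few lines. The log-likelihood values and parameter counts below are hypothetical placeholders, not the values from Table 4, and the function names are illustrative. The 5% critical value of 3.841 for a chi-square distribution with one degree of freedom is standard.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params

def lr_statistic(loglik_spatial, loglik_linear):
    """Likelihood ratio statistic comparing a spatial model with the linear model."""
    return 2.0 * (loglik_spatial - loglik_linear)

def best_by_aic(models):
    """Return the name of the model with the lowest AIC, plus all AIC values."""
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    return min(scores, key=scores.get), scores

# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
models = {"OLS": (-250.0, 7), "SAR": (-240.0, 8), "SEM": (-242.0, 8)}
best, scores = best_by_aic(models)
print("AIC values:", scores)
print("Selected model:", best)

CHI2_CRIT_5PCT_DF1 = 3.841  # 5% critical value, chi-square with 1 degree of freedom
lr = lr_statistic(models["SAR"][0], models["OLS"][0])
print("LR statistic:", lr, "significant:", lr > CHI2_CRIT_5PCT_DF1)
```

When both LM tests are significant, as in this study, the AIC comparison is what breaks the tie between SAR and SEM.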
Jointly, the number of dengue disease cases in Jakarta Province is significantly affected by the number of flood-prone points, the number of slum neighborhood associations, the population density, the number of hospitals per 1,000 population, the number of public health centers per 1,000 population, and the spatial lag. Individually, at a significance level of 5%, only the number of flood-prone points and the spatial lag have a significant effect on the number of dengue disease cases in Jakarta Province. This is consistent with Lilis Wijaya's 2018 research, which found that floods lead to post-flood ailments such as diarrhea, dengue disease, leptospirosis, Acute Respiratory Infection (ARI), intestinal worms, skin problems, and many more [26]. Wang et al. (2016) reported that the Aedes aegypti mosquito lives longer when humidity is high, such as during the rainy season, particularly in flood-prone areas, where the large volume of standing water makes the disease more likely to spread [2].
The number of slum neighborhood associations and the number of public health centers per 1,000 population have the same coefficient signs as in previous studies [1], [5]. The sign of a coefficient indicates the direction of the relationship between the predictor variable and the number of dengue disease cases. In this study, these effects are not significant at the 5% level, although their p-values remain below 0.2. Meanwhile, the population density and the number of hospitals per 1,000 population have coefficient signs that differ from those in previous studies [5], [6]. Their effects, however, are highly insignificant because the p-values exceed 0.5. The form of the SAR model in this study follows from the parameter estimation results in Table 5.

The impact of covariates in the SAR model can be classified into three categories: total impact, indirect impact, and direct impact [16]. Total impact refers to the changes that occur in one sub-district as a consequence of changes in that sub-district and its surroundings. Indirect impact refers to the effect on a sub-district when the predictor variables in the bordering sub-districts change. Direct impact refers to the effect that occurs locally in an area, which in this study is a sub-district, as a consequence of changes in the predictor variables in that same sub-district. Table 6 shows the magnitude of the direct and indirect impacts for the SAR model used in this study. Table 6 shows that the number of flood-prone points in Jakarta Province has a significant direct effect on the number of dengue disease cases. No variable has a significant indirect effect on dengue disease in Jakarta Province. The number of flood-prone points also has a significant total impact on the number of dengue disease cases. Each one percent increase in flood-prone points in a sub-district will result in a rise of 3.86 cases of dengue disease in that sub-district.

To assess the quality of the SAR model, a diagnostic test covering the assumptions of homogeneity, normality, and non-autocorrelation is needed. The SAR model satisfies all of these assumptions, as shown in Table 7.
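The three impact categories can be illustrated with a short sketch. It assumes the commonly used summary measures for a SAR model, in which the effects of predictor $k$ are read from the matrix $(I - \rho W)^{-1} \beta_k$: the direct impact is the average of its diagonal, the total impact is the average of its row sums, and the indirect impact is the difference. The function names and the two-region example are hypothetical, and the numbers are not those of Table 6.

```python
def invert(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def sar_impacts(rho, beta_k, W):
    """Return (direct, indirect, total) impacts of one predictor in a SAR model."""
    n = len(W)
    A = [[(1.0 if i == j else 0.0) - rho * W[i][j] for j in range(n)] for i in range(n)]
    S = invert(A)  # spatial multiplier (I - rho W)^-1
    direct = beta_k * sum(S[i][i] for i in range(n)) / n
    total = beta_k * sum(sum(row) for row in S) / n
    return direct, total - direct, total

# Hypothetical example: two regions that are each other's only neighbor.
W = [[0.0, 1.0], [1.0, 0.0]]
direct, indirect, total = sar_impacts(rho=0.5, beta_k=2.0, W=W)
print("direct:", direct, "indirect:", indirect, "total:", total)
```

With a row-standardized weight matrix the total impact equals $\beta_k / (1 - \rho)$, so a positive $\rho$ makes the total impact larger than the raw coefficient.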

CONCLUSIONS
Moran's I test showed that the number of dengue disease cases, the number of flood-prone points, and the population density exhibit spatial autocorrelation, whereas the number of slum neighborhood associations, the number of hospitals per 1,000 population, and the number of public health centers per 1,000 population do not.
Based on the Lagrange Multiplier test and the comparison of AIC values, the best spatial model is the Spatial Autoregressive (SAR) model. An increase in the number of flood-prone points in Jakarta Province has a direct effect on the number of dengue disease cases. Each additional one percent of flood-prone points in a sub-district will result in 3.86 additional cases of dengue disease in that sub-district. The derived SAR model is valid since it satisfies the assumptions of normality, absence of autocorrelation, and homogeneity.
It is recommended that the Jakarta Provincial Government strengthen flood management regulations in order to lower the incidence of dengue disease. Future investigations can include additional variables with a substantial association with the number of dengue disease cases. In addition, the Spatial Durbin Model (SDM) can be used for further research on the number of dengue disease cases if all predictor variables exhibit a spatial lag. Geographically Weighted Poisson Regression (GWPR) can also be used, since it handles spatial heterogeneity when the response is count data. Alternatively, the Conditional Autoregressive Besag-York-Mollié (CAR-BYM) model can be used, since it accommodates both spatial and non-spatial effects induced by the heterogeneity of cases between regions.

Figure 1. Map of the Distribution of Dengue Disease Cases in Jakarta Province

Table 1. Data Source and Variable Name
Notation | Variable Name | Data Source
If $H_0$ is rejected, the Spatial Autoregressive (SAR) model is used. The test statistics for spatial dependence on error are as follows:
Spatial Analysis of Dengue Disease in Jakarta Province, Muhamad Sobari, 539

Table 2. Results of the classical assumption tests for the linear regression model

Table 3. Spatial Autocorrelation Test Results with Moran's I
*) Significant

Table 4. LR and LM Test Results and AIC Values

Table 5. Estimation of SAR model parameters
*) Significant

Table 6. Measures of direct, indirect, and total impact of the SAR model
*) Significant

Table 7. SAR Model Diagnostic Test