Estimation Parameters and Modelling Zero Inflated Negative Binomial

A regression model that relates predictor variables to a Poisson-distributed response variable is called a Poisson regression model. Because Poisson regression requires the mean and the variance to be equal, it is not appropriate for overdispersed data. Poisson regression can be used to analyze count data, but it cannot handle an excess of zero values in the response variable. An alternative model that is better suited to overdispersed data and that accommodates excess zeros in the response variable is the Zero Inflated Negative Binomial (ZINB) model. In this research, ZINB is applied to cases of Tetanus Neonatorum in East Java. The aims of this research are to examine the likelihood function, to construct an algorithm for estimating the ZINB parameters, and to apply the ZINB model to cases of Tetanus Neonatorum in East Java. The ZINB parameters are estimated by Maximum Likelihood Estimation (MLE), and the likelihood function is maximized using the Expectation Maximization (EM) algorithm. The tests on the ZINB regression model show that the predictor variables with a significant partial effect in the negative binomial model are the percentage of visits by pregnant women and the percentage of deliveries assisted by health personnel, while the predictor variable with a significant partial effect in the zero inflation model is the percentage of neonatal visits.


INTRODUCTION
Regression analysis is used to determine the relationship between one or more response variables (Y) and one or more predictor variables (X). The classical linear model assumes that the response variable follows a normal distribution, but in practice the response variable often does not. To overcome this, the classical linear model was extended to the Generalized Linear Model (GLM) [1]. The GLM assumes that the response variable follows a distribution from the exponential family, which is a more general class. In some research the response variable follows a Poisson distribution, and the regression analysis used for this kind of data is Poisson regression. The Poisson regression model is commonly used to analyze count data and assumes Y ~ Poisson(μ). A key assumption of Poisson regression is that the variance equals the mean, a condition called equidispersion. Count data often contain more than 50 percent zero values in the response variable (zero inflation) [2]. An excessive proportion of zero values can reduce the accuracy of inference. Poisson regression can be used to analyze count data but cannot resolve the problem of excess zeros. When count data contain many zero observations in the response variable, the Zero Inflated Poisson (ZIP) regression model can be used [3]. However, if there are many zero observations and overdispersion also occurs, ZIP regression is not appropriate. Overdispersion is the condition in which the variance of the response variable is greater than its mean. If count data show both many zero observations in the response variable (zero inflation) and overdispersion, a regression model that can be used is the Zero Inflated Generalized Poisson (ZIGP) model [2].
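As a minimal sketch of how equidispersion and zero inflation can be checked before choosing a model, the following Python snippet computes the variance-to-mean ratio and the share of zeros of a count variable. The function name `dispersion_summary` and the example counts are hypothetical and are not the East Java data.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return (variance-to-mean ratio, proportion of zeros) for count data."""
    m = mean(counts)
    v = variance(counts)  # sample variance, n - 1 in the denominator
    zero_share = sum(1 for y in counts if y == 0) / len(counts)
    return v / m, zero_share


# Hypothetical counts used only for illustration.
example_counts = [0, 0, 0, 0, 0, 0, 1, 0, 2, 7]
ratio, zero_share = dispersion_summary(example_counts)
print(f"variance/mean = {ratio:.2f}, share of zeros = {zero_share:.0%}")
```

A ratio well above 1 points to overdispersion, and a share of zeros above 50 percent corresponds to the zero inflation condition described above.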
Besides ZIGP, another alternative for modelling data with many zero observations and overdispersion is the Zero Inflated Negative Binomial (ZINB) regression model. The ZINB model is formed from a Poisson-Gamma mixture distribution [4]. It can be used for such data because it does not require the variance to equal the mean; in addition, it has a dispersion parameter, commonly denoted by κ (kappa), that describes the variation in the data. The purpose of this research is to examine the form of the likelihood, to estimate the parameters of the ZINB model, and to apply the ZINB model to Tetanus Neonatorum cases.
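The Poisson-Gamma mixture behind the negative binomial part can be illustrated with a small simulation. The sketch below assumes the common parameterization in which the Poisson rate is μ multiplied by a Gamma variable with mean 1 and variance κ, so that the resulting count has mean μ and variance μ + κμ². The helper names, the seed, and the values of μ and κ are illustrative assumptions, not quantities from this study.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_gamma_draw(mu, kappa, rng):
    """Draw Y | lam ~ Poisson(lam) with lam = mu * G, G ~ Gamma(mean 1, variance kappa)."""
    g = rng.gammavariate(1.0 / kappa, kappa)  # shape 1/kappa, scale kappa
    return poisson_draw(mu * g, rng)


rng = random.Random(2012)
mu, kappa, n = 2.0, 0.5, 20000  # illustrative values only
sample = [poisson_gamma_draw(mu, kappa, rng) for _ in range(n)]
sample_mean = sum(sample) / n
sample_var = sum((y - sample_mean) ** 2 for y in sample) / (n - 1)
print("sample mean:", sample_mean)
print("sample variance:", sample_var, "theoretical:", mu + kappa * mu ** 2)
```

The sample variance exceeds the sample mean, which is the overdispersion that a plain Poisson model cannot reproduce.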

METHODS
This research used secondary data from the East Java Health Profile 2012 [5]. The units of observation were the 38 districts/cities of East Java province, comprising 29 districts and 9 cities. The response variable (Y) is the number of Tetanus Neonatorum cases in each district/city of East Java province, and four predictor variables (X) are used. The operational definitions of the response variable and the predictor variables are given in the list of variables.

RESULTS AND DISCUSSION
The parameters of the Zero Inflated Negative Binomial (ZINB) model were estimated by Maximum Likelihood Estimation (MLE), and the likelihood function was maximized with the EM (Expectation Maximization) algorithm. The probability function of the ZINB model can be defined as:

$$
P(Y_i = y_i) =
\begin{cases}
p_i + (1 - p_i)\left(\dfrac{1}{1 + \kappa\mu_i}\right)^{1/\kappa}, & y_i = 0 \\[2ex]
(1 - p_i)\,\dfrac{\Gamma(y_i + 1/\kappa)}{\Gamma(1/\kappa)\, y_i!}\left(\dfrac{1}{1 + \kappa\mu_i}\right)^{1/\kappa}\left(\dfrac{\kappa\mu_i}{1 + \kappa\mu_i}\right)^{y_i}, & y_i > 0
\end{cases}
$$

where $p_i$ is the probability of the zero state, $\mu_i$ is the mean of the negative binomial state, and $\kappa > 0$ is the dispersion parameter.

The probability function of the ZINB model covers two conditions, $y_i = 0$ and $y_i > 0$. The response variable likewise arises from two states, the zero state and the negative binomial state. To describe the condition of $y_i$ in detail, $y_i$ is redefined in terms of a latent variable $Z_i$. The EM algorithm is an alternative method for maximizing the likelihood function of data that contain latent variables, here by introducing the new variable $Z_i$. It consists of two stages: the expectation stage computes the expectation of the ln likelihood function, and the maximization stage finds the parameter estimates that maximize the expected ln likelihood function obtained from the expectation stage.

Parameter estimation and parameter tests for the ZINB model on Tetanus Neonatorum cases in East Java Province were carried out with SAS software; the results are shown in Table 1. In both the negative binomial state model and the zero inflation state model, some regression coefficients have signs opposite to those expected from theory: the percentage of deliveries assisted by health workers (X3) in the negative binomial state model, and the percentage of K4 visits by pregnant women (X1) and the percentage of deliveries assisted by health workers (X3) in the zero inflation state model. Coefficient signs contrary to theory are probably caused by multicollinearity. They may also result from the pattern of the data, in which these predictor variables are positively correlated with the response variable. In subsequent studies, multicollinearity among the predictor variables can be addressed using Principal Component Analysis (PCA).
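The ZINB probability function discussed in this section can be written as a short Python function. This is a sketch under the usual parameterization with negative binomial mean μ, dispersion κ, and zero-state probability p; the function name `zinb_pmf` and the numerical values are illustrative assumptions. It also checks numerically that the probabilities sum to one and that the mean equals (1 − p)μ.

```python
import math


def zinb_pmf(y, mu, kappa, p):
    """ZINB probability of count y (NB mean mu, dispersion kappa, zero-state probability p)."""
    r = 1.0 / kappa
    # Log of the negative binomial probability, computed with lgamma for stability.
    log_nb = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
              - r * math.log1p(kappa * mu)
              + y * (math.log(kappa * mu) - math.log1p(kappa * mu)))
    nb = math.exp(log_nb)
    # Zeros can come from either state; positive counts only from the NB state.
    return p + (1.0 - p) * nb if y == 0 else (1.0 - p) * nb


mu, kappa, p = 2.0, 0.5, 0.3  # illustrative values only
probs = [zinb_pmf(y, mu, kappa, p) for y in range(200)]
total = sum(probs)
zinb_mean = sum(y * q for y, q in enumerate(probs))
print("P(Y = 0):", probs[0])
print("sum of probabilities:", total)
print("mean:", zinb_mean, "expected (1 - p) * mu:", (1 - p) * mu)
```

The probability of a zero is larger than under the negative binomial distribution alone, because the zero state contributes the extra mass p.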

CONCLUSION
Based on the results, the parameters of the Zero Inflated Negative Binomial (ZINB) model were estimated by Maximum Likelihood Estimation (MLE), and the likelihood function was maximized with the EM (Expectation Maximization) algorithm. In the parameter tests, the predictor variables with a significant effect on the number of Tetanus Neonatorum cases in the negative binomial state model are the percentage of K4 visits by pregnant women (X1) and the percentage of deliveries assisted by health workers (X3), while in the zero inflation state model the predictor variable with a significant effect on the number of Tetanus Neonatorum cases is the percentage of neonatal visits (X4).

The latent variable $Z_i$ is defined as:

$$
Z_i =
\begin{cases}
1, & \text{if } y_i \text{ comes from the zero state} \\
0, & \text{if } y_i \text{ comes from the negative binomial state}
\end{cases}
$$

so that $Z_i = 0$ whenever $y_i > 0$.

The Zero Inflated Negative Binomial (ZINB) regression model consists of two component models.

Model for the negative binomial state:

$$
\ln \hat{\mu}_i = \hat{\beta}_0 + \sum_{j=1}^{k} \hat{\beta}_j x_{ij}, \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, k
$$

Model for the zero inflation state:

$$
\operatorname{logit} \hat{p}_i = \hat{\gamma}_0 + \sum_{j=1}^{k} \hat{\gamma}_j x_{ij}, \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, k
$$

The method of analysis in this study is as follows.
a. Identify the probability function of the ZINB model.
b. Determine the likelihood function of the ZINB model from the known probability function.
c. Develop an algorithm for the parameter estimation process based on the known likelihood function. The parameters of the ZINB model are estimated by the MLE method and solved using the EM algorithm.
d. Fit the ZINB model to Tetanus Neonatorum cases in East Java Province.
e. Test the significance of the model parameters with simultaneous and partial tests. The simultaneous test uses the G test statistic and the partial test uses the t test statistic.
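The role of the latent variable $Z_i$ in the EM algorithm can be sketched in Python for a deliberately simplified case. The sketch assumes an intercept-only model (no predictor variables) and holds the dispersion κ fixed, so that both M-step updates have closed forms; it is not the SAS procedure used in this study. A full ZINB regression would replace the M-step with weighted fits of the two regression models and would also update κ. All function names and the example counts are hypothetical.

```python
import math


def nb_logpmf(y, mu, kappa):
    """Log probability of y under the negative binomial (mean mu, dispersion kappa)."""
    r = 1.0 / kappa
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            - r * math.log1p(kappa * mu)
            + y * (math.log(kappa * mu) - math.log1p(kappa * mu)))


def zinb_loglik(ys, p, mu, kappa):
    """Observed-data ln likelihood of an intercept-only ZINB model."""
    total = 0.0
    for y in ys:
        nb = math.exp(nb_logpmf(y, mu, kappa))
        total += math.log(p + (1 - p) * nb) if y == 0 else math.log((1 - p) * nb)
    return total


def em_zinb_intercept_only(ys, kappa, p=0.5, n_iter=50):
    """EM iterations for (p, mu) with kappa held fixed (a simplifying assumption)."""
    mu = sum(ys) / len(ys)
    history = [zinb_loglik(ys, p, mu, kappa)]
    for _ in range(n_iter):
        # E-step: expected value of Z_i given y_i; Z_i = 0 whenever y_i > 0.
        nb0 = math.exp(nb_logpmf(0, mu, kappa))
        z_zero = p / (p + (1 - p) * nb0)
        z = [z_zero if y == 0 else 0.0 for y in ys]
        # M-step: p is the mean of the expected Z_i; mu is the mean of y
        # weighted by the expected membership in the negative binomial state.
        p = sum(z) / len(ys)
        weights = [1.0 - zi for zi in z]
        mu = sum(w * y for w, y in zip(weights, ys)) / sum(weights)
        history.append(zinb_loglik(ys, p, mu, kappa))
    return p, mu, history


# Hypothetical counts for 38 units; these are not the East Java data.
ys = [0] * 26 + [1] * 5 + [2] * 3 + [3] * 2 + [5, 8]
p_hat, mu_hat, history = em_zinb_intercept_only(ys, kappa=0.5)
print("p:", p_hat, "mu:", mu_hat)
print("ln likelihood, first and last:", history[0], history[-1])
```

Each iteration leaves the observed ln likelihood unchanged or higher, which is the property that makes EM a practical way to maximize a likelihood with latent variables.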
a. The response variable (Y): number of cases of Tetanus Neonatorum
b. Predictor variables (X)
1. Percentage of K4 visits by pregnant women (X1)
2. Percentage of Tetanus Toxoid (TT) immunization in pregnant women (X2)
3. Percentage of deliveries assisted by health workers (X3)
4. Percentage of neonatal visits (X4)

Table 1. Estimation Parameter and Parameter Test of ZINB

According to Table 1, two predictor variables in the negative binomial state model and one predictor variable in the zero inflation state model have a t value greater than or equal to t(α/2; 37) = 2.00 and a p-value less than α (0.05). This indicates that the predictor variables with a significant partial effect in the negative binomial state model are the percentage of K4 visits by pregnant women (X1) and the percentage of deliveries assisted by health workers (X3), while the predictor variable with a significant partial effect in the zero inflation state model is the percentage of neonatal visits (X4). The Zero Inflated Negative Binomial (ZINB) model can therefore be defined as follows.

a. Negative binomial state model for $\hat{\mu}_i$:

$$
\hat{\mu}_i = \exp(-5.847 - 0.145 X_1 - 0.006 X_2 + 0.233 X_3 - 0.023 X_4)
$$

Coefficients that are not significant are retained in both the negative binomial state model and the zero inflation state model so that the contribution of each predictor variable to the response variable can be determined.

b. Zero inflation state model for $\hat{p}_i$, whose coefficients are interpreted as follows:
1. Each additional 1 percent of K4 visits by pregnant women (X1) increases the chance of Tetanus Neonatorum cases by a factor of exp(0.223) = 1.249 relative to the original number of cases, if the other variables are held constant.
2. Each additional 1 percent of Tetanus Toxoid (TT) immunization in pregnant women (X2) decreases the chance of Tetanus Neonatorum cases by a factor of exp(0.296) = 1.344 relative to the original number of cases, if the other variables are held constant.
3. Each additional 1 percent of deliveries assisted by health workers (X3) increases the chance of Tetanus Neonatorum cases by a factor of exp(0.835) = 2.305 relative to the original number of cases, if the other variables are held constant.
4. Each additional 1 percent of neonatal visits (X4) decreases the chance of Tetanus Neonatorum cases by a factor of exp(1.078) = 2.939 relative to the original number of cases, if the other variables are held constant.
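The multiplicative factors quoted in this interpretation are exponentials of the coefficient magnitudes, and they can be recomputed with a few lines of Python. The dictionary below simply restates the magnitudes and directions given in the text; the variable names are illustrative. With the rounded coefficient 0.223 the factor for X1 evaluates to about 1.250, so the reported 1.249 presumably reflects an unrounded estimate, which is an assumption on our part.

```python
import math

# Coefficient magnitudes and directions as stated in the interpretation list.
reported = {
    "X1": (0.223, "increase"),
    "X2": (0.296, "decrease"),
    "X3": (0.835, "increase"),
    "X4": (1.078, "decrease"),
}

# A one-unit change multiplies (or divides) the quantity by exp(|coefficient|).
factors = {name: math.exp(coef) for name, (coef, _) in reported.items()}

for name, (coef, direction) in reported.items():
    print(f"{name}: exp({coef}) = {factors[name]:.3f} ({direction})")
```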