Some families of count distributions for modelling zero-inflation and dispersion / Low Yeh Ching
| Main Author: | Low Yeh Ching |
|---|---|
| Format: | Thesis |
| Published: | 2016 |
| Subjects: | |
| Online Access: | http://studentsrepo.um.edu.my/6686/ http://studentsrepo.um.edu.my/6686/1/yeh_ching.pdf |
| Summary: | A popular distribution for modelling discrete count data is the Poisson distribution. In empirical modelling, however, count data usually exhibit overdispersion or underdispersion relative to the Poisson distribution. The presence of excess zeros is also closely related to overdispersion. Two new mixed Poisson distributions, namely a three-parameter Poisson-exponentiated Weibull distribution and a four-parameter generalized Sichel distribution, are introduced to model overdispersed, zero-inflated and long-tailed count data. Some theoretical properties of the distributions are derived and their characteristics are studied. A Monte Carlo simulation technique is examined and employed to overcome the computational issues arising from the intractability of the probability mass function of some mixed Poisson distributions. For parameter estimation, the simulated annealing global optimization routine and an EM-algorithm type approach to maximum likelihood estimation are studied. Examples are provided to compare the proposed distributions with several existing mixed Poisson models. Another approach to modelling count data is to examine the relationship between the number of events that have occurred up to a fixed time t and the inter-arrival times between the events in a renewal process. A family of count distributions that can model both underdispersion and overdispersion is presented by taking the inverse Gaussian distribution, the convolution of two gamma distributions, and a finite mixture of exponential distributions as the distribution of the inter-arrival times. The probability function of the counts is often complicated, so a method using numerical Laplace transform inversion is proposed for computing the probabilities and the renewal function. Parameter estimation by maximum likelihood is considered, and the count distributions are applied to underdispersed and overdispersed count data from the literature. |
|---|
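
The Monte Carlo idea mentioned in the summary can be illustrated with a minimal sketch. A mixed Poisson probability is the expectation of the Poisson probability over the mixing distribution, P(X = k) = E[exp(-Λ) Λ^k / k!], so it can be approximated by averaging over simulated values of Λ. The code below is not taken from the thesis: the function names are hypothetical, and gamma mixing is an assumption chosen only because it has a closed form (the negative binomial) against which the estimate can be checked. The Poisson-exponentiated Weibull and generalized Sichel models would need their own samplers.

```python
import math
import random


def poisson_pmf(k, lam):
    """Poisson probability P(X = k | lam), computed on the log scale."""
    if lam <= 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def mixed_poisson_pmf_mc(k, sample_lambda, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(X = k) = E[poisson_pmf(k, Lambda)].

    sample_lambda is a hypothetical callback: it takes a random.Random
    instance and returns one draw from the mixing distribution.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += poisson_pmf(k, sample_lambda(rng))
    return total / n_draws


def negative_binomial_pmf(k, shape, scale):
    """Closed form for gamma(shape, scale) mixing, used only as a check."""
    p = 1.0 / (1.0 + scale)
    log_pmf = (
        math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
        + shape * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


if __name__ == "__main__":
    shape, scale = 2.0, 1.5  # illustrative values, not from the thesis
    draw = lambda rng: rng.gammavariate(shape, scale)
    for k in range(6):
        estimate = mixed_poisson_pmf_mc(k, draw)
        exact = negative_binomial_pmf(k, shape, scale)
        print(k, round(estimate, 4), round(exact, 4))
```

Replacing `draw` with a sampler for another mixing distribution gives the corresponding estimate without requiring a tractable probability mass function, which is the situation the summary describes.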