Documat


Abstract of New models of count data with applications

Amanda Fernández Fontelo

  • Because count data arise naturally in many real processes, there is a clear need for reliable methods and techniques to model and analyse them accurately. In recent years, many comprehensive works have appeared in the literature in which both basic and more general methods for dealing with count data have been developed from different approaches. Despite this large body of excellent work on the major concerns in count data, some issues remain to be addressed.

    This Ph.D. thesis introduces novel methods and techniques of count data analysis to deal with issues such as overdispersion, zero-inflation (and zero-deflation), and under-reporting. It comprises several publications in which these methods are presented and discussed in detail.

    In particular, two of these articles [1, 2] focus on assessing under-reporting in count time series. These works propose two realistic models based on integer-valued autoregressive models, and real-data applications in different settings are studied to demonstrate the practicality of the proposed models. The paper [3], in turn, proposes a general model of count time series that accommodates slightly overdispersed data, even when the series is non-stationary. This model has been used to analyse data on fallen cattle collected at a local scale, where the series have low counts, many zeros, and moderate overdispersion, as part of a project commissioned by the Ministry of Agriculture, Food and Environment of Spain. The last paper included in this thesis [4] proposes an exact goodness-of-fit test for detecting zero-inflation (and zero-deflation) in count distributions within the biological dosimetry framework. The test suggested in [4] was first introduced in [5], where it was derived from occupancy problems. In the biological dosimetry context, this test is viewed as a complement to the commonly used u-test for cases in which the data are not overdispersed (or underdispersed) but are zero-inflated (or zero-deflated).

    The methods introduced in this Ph.D. thesis can be viewed as small but relevant steps forward in count data analysis. They allow several issues in count data to be studied from different points of view, and they show especially good results on real-world problems in public health and biological dosimetry. Although this work constitutes an advance in count data analysis, further effort is needed to improve the existing techniques and tools.

    [1] Fernández-Fontelo, A., Cabaña, A., Puig, P. and Moriña, D. (2016). Under-reported data analysis with INAR-hidden Markov chains. Statistics in Medicine; 35(26): 4875-4890.

    [2] Fernández-Fontelo, A., Cabaña, A., Joe, H., Puig, P. and Moriña, D. Count time series models with under-reported data for gender-based violence in Galicia (Spain). Submitted.

    [3] Fernández-Fontelo, A., Fontdecaba, S., Alba, A. and Puig, P. (2017). Integer-valued AR processes with Hermite innovations and time-varying parameters: An application to bovine fallen stock surveillance at a local scale. Statistical Modelling; 17(3): 172-195.

    [4] Fernández-Fontelo, A., Puig, P., Ainsbury, E.A. and Higueras, M. (2018). An exact goodness-of-fit test based on the occupancy problems to study zero-inflation and zero-deflation in biological dosimetry data. Radiation Protection Dosimetry: 1-10.

    [5] Rao, C.R. and Chakravarti, I.M. (1956). Some small sample tests of significance for a Poisson distribution. Biometrics; 12: 264-282.
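
As an illustration of the under-reporting setting addressed in [1, 2], the following is a minimal simulation sketch, not the thesis's implementation. It assumes a hidden Poisson INAR(1) series built with binomial thinning, and an observation rule in which each time point is under-reported with probability omega; when that happens, each unit of the true count is recorded independently with probability q. The abstract does not spell out the models, so this observation rule, the function name simulate_underreported_inar1 and all parameter names are assumptions made for illustration.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's product method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def thin(rng, count, prob):
    """Binomial thinning: keep each of `count` units independently with probability `prob`."""
    return sum(rng.random() < prob for _ in range(count))


def simulate_underreported_inar1(n, alpha, lam, omega, q, seed=0):
    """Simulate a hidden INAR(1) series and an under-reported version of it.

    Assumed (hypothetical) parameterisation:
      alpha : thinning probability of the autoregressive part, 0 <= alpha < 1
      lam   : mean of the Poisson innovations
      omega : probability that a given time point is under-reported
      q     : probability that each unit is recorded when under-reporting occurs
    """
    rng = random.Random(seed)
    # A Poisson INAR(1) process has a Poisson(lam / (1 - alpha)) stationary law.
    x = poisson(rng, lam / (1.0 - alpha))
    hidden, observed = [], []
    for _ in range(n):
        # Hidden dynamics: survivors of the previous count plus new arrivals.
        x = thin(rng, x, alpha) + poisson(rng, lam)
        # Observation: thinned with probability omega, otherwise reported in full.
        y = thin(rng, x, q) if rng.random() < omega else x
        hidden.append(x)
        observed.append(y)
    return hidden, observed


if __name__ == "__main__":
    hidden, observed = simulate_underreported_inar1(
        n=200, alpha=0.5, lam=2.0, omega=0.3, q=0.4, seed=1
    )
    # The observed mean should fall below the hidden mean when omega > 0 and q < 1.
    print(sum(hidden) / len(hidden), sum(observed) / len(observed))
```

Under these assumptions the observed series can never exceed the hidden one, so comparing the two sample means gives a quick visual check of how much information is lost for a given omega and q.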

