Abstract : | Studying associations among multivariate outcomes is an interesting problem in statistical science. The dependence between random variables is completely described by their multivariate distribution. When the multivariate distribution has a simple form, standard methods can be used to make inference. On the other hand, one may create multivariate distributions based on particular assumptions, thus limiting their use. Unfortunately, these limitations occur very often when working with multivariate discrete distributions. Some multivariate discrete distributions used in practice have only certain properties; for example, they allow only for positive dependence, or they have marginal distributions of a given form. Copulas seem to be a promising solution to this problem. Copulas are a currently fashionable way to model multivariate data, as they account for the dependence structure and provide a flexible representation of the multivariate distribution. Furthermore, for copulas the dependence properties can be separated from the marginal properties, and multivariate models with marginal densities of arbitrary form can be constructed, allowing a wide range of possible association structures. In fact, they allow for flexible dependence modelling, different from assuming simple linear correlation structures. However, in the application of copulas to discrete data, the marginal parameters also affect the dependence structure, and hence the dependence properties are not fully separated from the marginal properties. Introducing covariates to describe the dependence by modelling the copula parameters is of special interest in this thesis. Thus, covariate information can describe the dependence either indirectly through the marginal parameters or directly through the parameters of the copula.
We examine the case where covariates are used in the marginal parameters, the copula parameters, or both, aiming to create a highly flexible model that produces very elegant dependence structures. Furthermore, the literature contains many theoretical results and families of copulas with several properties, but few papers compare the copula families and discuss model selection among candidate copula models. This raises the questions of which copulas are appropriate and whether we are able, from real data, to select the true copula that generated the data among a series of candidates with, perhaps, very similar dependence properties. We examined a large set of candidate copula families, taking into account properties such as concordance and tail dependence. The comparison is made theoretically using Kullback-Leibler distances between them. We selected this distance because it is closely related to the log-likelihood and thus can provide insight into the likelihood-based procedures used in practice. Furthermore, a goodness-of-fit test based on the Mahalanobis distance, computed through parametric bootstrap, is provided. Moreover, we adopt a model averaging approach to copula modelling, based on the non-parametric bootstrap. Our intention is to avoid underestimating variability by incorporating the additional variability induced by model selection, making the precision of the estimate unconditional on the selected model. Moreover, our estimates are synthesized from several different candidate copula models and thus can have a flexible dependence structure. Taking into consideration the extensive literature on copulas for multivariate continuous data, we concentrated our interest on fitting copulas to multivariate discrete data. The applications of multivariate copula models for discrete data are limited. Usually we have to trade off between models with limited dependence (e.g. 
only positive association) and models with flexible dependence that are computationally intractable. For example, the elliptical copulas provide a wide range of flexible dependence but do not have closed-form cumulative distribution functions. Thus one needs to evaluate the multivariate copula, and hence a multivariate integral, a large number of times. This can be time-consuming and, because the multivariate integral is evaluated numerically, it may also introduce roundoff errors. On the other hand, multivariate Archimedean copulas, partially symmetric m-variate copulas with m-1 dependence parameters, and copulas that are mixtures of max-infinitely divisible bivariate copulas have closed-form cumulative distribution functions, so computations are easy, but they allow only positive dependence among the random variables. A bridge between these two problems might be a copula family that has a simple form for its distribution function while allowing for negative dependence among the variables. We define such a multivariate copula family using a finite mixture of simple uncorrelated normal distributions. Since the correlation vanishes within each component, the cumulative distribution function of each component is simply the product of univariate normal cumulative distribution functions. The mixing operation introduces dependence. Hence we obtain a flexible form of dependence that also allows for negative dependence.
|
---|
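
The closing idea of the abstract, a finite mixture of uncorrelated normal components whose joint distribution function is a weighted sum of products of univariate normal distribution functions, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the two-component setup, and all parameter values below are hypothetical and chosen only to show how mixing can induce negative dependence. The copula itself would be obtained by composing this joint distribution function with the inverses of the mixture marginals, which is not shown here.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Univariate normal CDF, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def mixture_joint_cdf(x, weights, means, sds):
    """Joint CDF of a finite mixture of uncorrelated normal components.

    Within each component the coordinates are independent, so the
    component CDF is a product of univariate normal CDFs; the mixture
    CDF is the weighted sum of these products.
    """
    total = 0.0
    for w, mu, sd in zip(weights, means, sds):
        prod = 1.0
        for xj, mj, sj in zip(x, mu, sd):
            prod *= normal_cdf(xj, mj, sj)
        total += w * prod
    return total


def mixture_marginal_cdf(xj, j, weights, means, sds):
    """Marginal CDF of coordinate j: a univariate normal mixture."""
    return sum(w * normal_cdf(xj, mu[j], sd[j])
               for w, mu, sd in zip(weights, means, sds))


# Hypothetical example: two equally weighted components whose means
# point in opposite directions, which induces negative dependence.
weights = [0.5, 0.5]
means = [(-1.0, 1.0), (1.0, -1.0)]
sds = [(1.0, 1.0), (1.0, 1.0)]

joint = mixture_joint_cdf((0.0, 0.0), weights, means, sds)
m0 = mixture_marginal_cdf(0.0, 0, weights, means, sds)
m1 = mixture_marginal_cdf(0.0, 1, weights, means, sds)

# Under independence the joint CDF would equal m0 * m1; a smaller value
# at this point indicates negative dependence there.
print(joint, m0 * m1)
```

Every evaluation uses only univariate normal distribution functions, so no multivariate numerical integration is needed, which is the computational advantage the abstract contrasts with elliptical copulas.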