Bayesian Covariance Modelling of Big Tensor-Variate Data Sets Inverse Non-parametric Learning Of th

Source: The 24th International Workshop on Matrices and Statistics
  Tensor-valued data are encountered increasingly often in the biological, natural and social sciences. Learning the unknown model parameter vector given such data involves modelling the covariance of the data, which is difficult because the data are high-dimensional, and the numerical challenge is compounded when the data set is large. If such data are modelled with a correspondingly high-dimensional Gaussian Process (GP), the joint density of a finite set of such data sets is tensor-normal, parametrised by a mean tensor and k covariance matrices. Modelling the covariance structure of the data therefore requires estimating these covariance matrices. We present a new method that performs this covariance modelling by first expressing the probability density of the available data sets as tensor-normal; that is, the likelihood of the unknown matrix-variate and tensor-variate parameters of the GP, given the data, is tensor-normal. We then place appropriate (vague) priors on these unknown parameters and write the posterior of the unknowns given the data. We sample from this posterior using an appropriate variant of Metropolis-Hastings. To reduce the computational burden, the mean tensor is estimated by maximum likelihood in a pre-processing step. We illustrate the method empirically on large three-dimensional astronomical and economic data sets of size N × M × P. To begin with, we work with the squared exponential covariance function, so that the correlation length-scales are learnt from the data. Although these steps reduce the difficulty of learning the covariance model, MCMC inference on such large, high-dimensional data sets remains time- and resource-intensive.
This motivates our use of an efficient variant of the Metropolis-Hastings algorithm, Transformation-based MCMC, which is designed for efficient sampling from a high-dimensional state space. In further applications, other kinds of covariance functions will be discussed as well. Once the covariance modelling of such a data set is done, we will learn the unknown model parameter vector at which a measured (or test) data set has been obtained, given the already modelled (training) data augmented by the test data.
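
The pipeline described above (tensor-normal likelihood, squared exponential covariance, vague priors, Metropolis-Hastings over the length-scales) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation. It uses a plain random-walk Metropolis sampler rather than Transformation-based MCMC, a single N × M × P tensor, one length-scale per mode, a constant mean as a stand-in for the pre-estimated mean tensor, and a Gaussian prior on the log length-scales. All function names, the grid coordinates and the jitter term are hypothetical choices made for the example.

```python
import numpy as np

def sq_exp_cov(x, ell, jitter=1e-6):
    """Squared exponential covariance on 1-D inputs x with length-scale ell.
    The jitter on the diagonal is an assumption, added for numerical stability."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ell) ** 2) + jitter * np.eye(len(x))

def tensor_normal_logpdf(X, mean, covs):
    """Log-density of a tensor-normal distribution with one covariance matrix
    per mode; equivalent to a Gaussian on X.ravel() with covariance
    kron(covs[0], kron(covs[1], ...)), but never forms that large matrix."""
    R = X - mean
    total = R.size
    logdet_term = 0.0
    for mode, S in enumerate(covs):
        n = S.shape[0]
        L = np.linalg.cholesky(S)
        # log|S_k| enters with multiplicity (total / n_k)
        logdet_term += (total // n) * 2.0 * np.sum(np.log(np.diag(L)))
        # whiten along this mode: apply L^{-1} to every mode-k fibre
        R = np.moveaxis(R, mode, 0)
        shape = R.shape
        R = np.linalg.solve(L, R.reshape(n, -1)).reshape(shape)
        R = np.moveaxis(R, 0, mode)
    quad = np.sum(R ** 2)
    return -0.5 * (total * np.log(2.0 * np.pi) + logdet_term + quad)

def log_posterior(log_ells, X, mean, grids, prior_sd=3.0):
    """Tensor-normal log-likelihood plus an assumed vague N(0, prior_sd^2)
    prior on each log length-scale."""
    covs = [sq_exp_cov(g, np.exp(le)) for g, le in zip(grids, log_ells)]
    log_prior = -0.5 * np.sum((log_ells / prior_sd) ** 2)
    return tensor_normal_logpdf(X, mean, covs) + log_prior

def metropolis(X, mean, grids, n_iter=500, step=0.1, seed=0):
    """Random-walk Metropolis on the log length-scales (not TMCMC).
    Returns an (n_iter, number_of_modes) array of sampled length-scales."""
    rng = np.random.default_rng(seed)
    current = np.full(len(grids), np.log(0.5))
    current_lp = log_posterior(current, X, mean, grids)
    chain = np.empty((n_iter, len(grids)))
    for t in range(n_iter):
        proposal = current + step * rng.standard_normal(len(grids))
        proposal_lp = log_posterior(proposal, X, mean, grids)
        # symmetric proposal, so the acceptance ratio is the posterior ratio
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[t] = current
    return np.exp(chain)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.standard_normal((6, 5, 4))            # synthetic stand-in data
    mean = np.full_like(X, X.mean())              # constant-mean assumption
    grids = [np.linspace(0.0, 1.0, n) for n in X.shape]
    samples = metropolis(X, mean, grids, n_iter=300)
    print("posterior mean length-scales:", samples[100:].mean(axis=0))
```

Whitening the residual tensor mode by mode with the Cholesky factor of each covariance matrix keeps the cost at the scale of the individual N × N, M × M and P × P matrices, which is the usual reason for working with the separable tensor-normal form on large data sets.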