Covariance Estimation for Compositional Data via Composition-Adjusted Thresholding

Source: Shanghai Jiao Tong University
High-dimensional compositional data arise naturally in many applications, such as metagenomic data analysis. The observed data lie in a high-dimensional simplex, and conventional statistical methods often fail to produce sensible results because of the unit-sum constraint. In this article, we address the problem of covariance estimation for high-dimensional compositional data and introduce a composition-adjusted thresholding (COAT) method under the assumption that the basis covariance matrix is sparse. Our method is based on a decomposition relating the compositional covariance to the basis covariance, which is approximately identifiable as the dimensionality tends to infinity.
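The idea of thresholding a composition-based covariance estimate can be illustrated with a minimal sketch. It rests on assumptions that this abstract does not state: that the compositional covariance is taken to be the sample covariance of centered log-ratio (clr) transformed data, and that a single soft-threshold level `lam` is applied to the off-diagonal entries. The function names `clr`, `soft_threshold`, and `coat_sketch` are hypothetical, and the fixed threshold is a simplification, not necessarily the rule used in the article.

```python
import numpy as np


def clr(X):
    """Centered log-ratio transform of each row of X (strictly positive compositions)."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)


def soft_threshold(S, lam):
    """Soft-threshold the off-diagonal entries of S at level lam; keep the diagonal."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T


def coat_sketch(X, lam):
    """Illustrative estimator: threshold the sample covariance of clr-transformed data.

    Assumption: one universal threshold lam stands in for whatever
    thresholding rule the article actually uses.
    """
    Z = clr(X)                   # n x p matrix of log-ratios, rows sum to zero
    S = np.cov(Z, rowvar=False)  # p x p sample covariance of the clr data
    return soft_threshold(S, lam)


# Usage on simulated data: positive basis counts normalized to unit sum.
rng = np.random.default_rng(0)
W = np.exp(rng.normal(size=(50, 8)))  # hypothetical basis abundances
X = W / W.sum(axis=1, keepdims=True)  # compositions on the simplex
Sigma_hat = coat_sketch(X, lam=0.1)
```

In practice the threshold level would be chosen from the data, for example by cross-validation; the value `0.1` here is arbitrary.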