Graph-Based Semi-Supervised Machine Learning

Source: Zhejiang University
Over the past few years, semi-supervised learning has attracted considerable interest and achieved success in both theory and practice. Traditional supervised learning algorithms can use only labeled data, and they often reach reasonable performance only when a large amount of labeled data is available. However, labeled data is often expensive and time-consuming to collect, while unlabeled data is usually cheaper and easier to obtain. The strength of semi-supervised learning lies in its ability to exploit large quantities of unlabeled data to improve learning performance effectively and efficiently.

Recently, graph-based semi-supervised learning algorithms have been studied intensively, thanks to their convenient local representation and their connections with other models such as kernel machines. The graph Laplacian is the central quantity in graph-based semi-supervised learning; it is used to explore the underlying manifold geometry of the data. Using the graph Laplacian to form a regularization problem and then applying kernel techniques is a promising approach to semi-supervised learning.

The author first introduces the basic concepts of semi-supervised learning, together with the tools and theory used, such as support vector machines, kernel methods, and regularization theory.

The main contributions of this thesis are presented in chapters 5 and 6. In chapter 5, the author first investigates a class of graph-based semi-supervised learning methods based on spectral transformation. He then formulates semi-supervised spectral kernel learning under the maximum margin criterion with decreasing-order constraints on the spectrum, and argues through theoretical analysis that the maximum margin criterion is a more essential goal of semi-supervised kernel learning than kernel target alignment. The resulting intractable optimization problem is equivalently transformed into a quadratically constrained quadratic program (QCQP), which can be solved efficiently. The author also proposes a method for automatically tuning the trade-off parameter involved. Furthermore, the author seeks another way to learn the spectral coefficients from a more essential point of view. Because the spectral order constraints are not hard requirements but serve only to ensure the smoothness of the score function, he drops those constraints and instead includes the smoothness regularizer directly in the maximum margin objective, which coincides with the theory of manifold regularization. An efficient iterative algorithm for this formulation is also designed. Experimental results on real-world data sets show that both proposed spectral learning methods achieve promising results compared with other approaches.

Motivated by the requirements of many practical problems, in chapter 6 the author turns to semi-supervised learning with structured outputs, a more general topic than standard semi-supervised learning. By extending the definition of the smoothness regularizer to the multi-class setting, he first explores multi-class semi-supervised classification. Although the resulting data-dependent kernel is similar to that of Sindhwani et al., the multi-class model extends their theory. The author then generalizes the multi-class manifold regularization problem to the setting with structured outputs and derives the corresponding dual problems. The dual formulations show that the semi-supervised learning task can ultimately be carried out as supervised structured prediction with a newly defined "data dependent joint kernel matrix". This data-dependent kernel matrix generalizes that of Sindhwani et al. to structured prediction. Moreover, the proposed approach is inductive and can naturally predict unseen data points beyond the unlabeled data. Experiments on hierarchical text categorization are conducted, and the empirical results show that the approaches exploit the structural and manifold information of the data simultaneously, which improves prediction performance. As a supplement, the author also proposes the concept of a joint Laplacian, which shares similar properties with the standard Laplacian matrix.
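
The role of the graph Laplacian as a smoothness regularizer can be illustrated with a small sketch. This is not the thesis's algorithm. It assumes a fully connected graph with Gaussian edge weights and the unnormalized Laplacian L = D - W; the toy points, the bandwidth `sigma`, and all function names are illustrative choices. It checks the standard identity f^T L f = 1/2 * sum_ij W_ij (f_i - f_j)^2 and compares a score function that is constant within each cluster against one that changes sign between close neighbors.

```python
import math


def gaussian_weights(points, sigma=1.0):
    """Dense symmetric affinity matrix W with a zero diagonal (assumed graph)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W, where D holds the row sums of W."""
    n = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]


def smoothness(L, f):
    """Quadratic form f^T L f used as the smoothness penalty."""
    n = len(f)
    return sum(f[i] * L[i][j] * f[j] for i in range(n) for j in range(n))


def pairwise_penalty(W, f):
    """Equivalent edge-wise form: 0.5 * sum_ij W_ij * (f_i - f_j)^2."""
    n = len(f)
    return 0.5 * sum(
        W[i][j] * (f[i] - f[j]) ** 2 for i in range(n) for j in range(n)
    )


# Two well-separated toy clusters of three points each (illustrative data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
W = gaussian_weights(points, sigma=0.5)
L = graph_laplacian(W)

f_smooth = [1, 1, 1, -1, -1, -1]   # constant within each cluster
f_rough = [1, -1, 1, -1, 1, -1]    # flips sign between close neighbors

s_smooth = smoothness(L, f_smooth)
s_rough = smoothness(L, f_rough)
print("penalty of cluster-consistent scores:", s_smooth)
print("penalty of oscillating scores:       ", s_rough)
```

A score function that respects the cluster structure incurs a much smaller penalty than one that varies between nearby points, which is the sense in which a Laplacian-based regularizer favors smooth functions on the data graph.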