To better exploit the strengths of three machine learning methods (active learning, semi-supervised learning, and ensemble learning), we study an efficient learning algorithm that generalizes well and does not require two sufficient and redundant views. Starting from the cluster assumption, we give a confidence measure for the automatically labeled samples added in each round of co-training, which lowers the mislabeling rate. We also introduce the notion of contribution degree as the criterion for actively selecting unlabeled samples: the higher a sample's contribution degree, the more it is worth labeling manually. Once the co-training iterations finish, the samples with a high contribution degree are selected for manual labeling, which strengthens the effect of feedback and improves learning performance. Together these yield an ensemble co-training algorithm based on active learning. Experiments on image retrieval show that the proposed algorithm is efficient and feasible.
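The abstract does not state the formulas for the two measures, so the following is only an illustrative sketch under stated assumptions, not the authors' definitions. It assumes that confidence can be approximated by the ensemble's agreement on a sample, scaled by how many of its nearest neighbours receive the same majority label (one reading of the cluster assumption), and that contribution degree can be approximated by vote entropy, a common disagreement measure in active learning. The function names, the neighbourhood size `k`, and both formulas are hypothetical.

```python
import math
from collections import Counter


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean), excluding i."""
    dists = sorted((math.dist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]


def majority_and_agreement(votes):
    """Majority label and the fraction of ensemble members that voted for it."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


def confidence(points, votes_per_sample, i, k=2):
    """Assumed confidence: ensemble agreement on sample i, scaled by the share
    of its k nearest neighbours whose majority label matches (cluster assumption)."""
    label, agree = majority_and_agreement(votes_per_sample[i])
    neighbours = knn_indices(points, i, k)
    same = sum(majority_and_agreement(votes_per_sample[j])[0] == label for j in neighbours)
    return agree * same / len(neighbours)


def contribution(votes):
    """Assumed contribution degree: entropy of the ensemble's votes.
    Higher entropy means more disagreement, so a manual label is worth more."""
    n = len(votes)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(votes).values())


def select_for_manual_labeling(votes_per_sample, budget):
    """Indices of the `budget` samples with the highest assumed contribution degree."""
    order = sorted(range(len(votes_per_sample)),
                   key=lambda i: contribution(votes_per_sample[i]), reverse=True)
    return order[:budget]
```

In a co-training round under these assumptions, samples whose `confidence` exceeds a chosen threshold would be added with their majority label, while `select_for_manual_labeling` would be called once the iterations finish to pick the samples passed to the human annotator.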