Data is generally abundant in unlabeled form, and annotating it is costly. Both the labeling cost and the learning cost can be minimized by learning from as few labeled data instances as possible. Active learning (AL) learns from a few labeled data instances, with the additional facility of querying the labels of instances from an expert annotator or oracle. The active learner uses an instance selection strategy to select the critical query instances, those that reduce the generalization error as fast as possible. This process results in a refined training dataset, which helps minimize the overall cost. The key to the success of AL is the query strategy, which selects the candidate query instances and helps the learner learn a valid hypothesis. This survey reviews AL query strategies for classification, regression, and clustering under the pool-based AL scenario. The query strategies for classification are further divided into informative-based, representative-based, informative- and representative-based, and others. More advanced query strategies based on reinforcement learning and deep learning, along with query strategies under realistic environment settings, are also presented. After a rigorous mathematical analysis of AL strategies, this work presents a comparative analysis of these strategies. Finally, an implementation guide, applications, and challenges of AL are discussed.
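The pool-based setting described above can be illustrated with a minimal sketch. It is not the survey's own method. It assumes a toy nearest-centroid learner, binary labels 0 and 1, and least-confidence sampling as one example of an informative-based query strategy. The names `fit_centroids`, `predict_proba`, `active_learning_loop`, and `oracle` are hypothetical and introduced only for illustration.

```python
import numpy as np

def fit_centroids(X, y):
    """Toy learner (assumption): one mean vector per class, labels 0 and 1."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_proba(centroids, X):
    """Softmax over negative squared distances to each class centroid."""
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -dist
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def active_learning_loop(X_pool, oracle, initial_idx, n_queries):
    """Pool-based AL: repeatedly query the label of the least-confident instance.

    oracle(i) returns the label of pool instance i (stands in for the annotator).
    initial_idx must contain at least one instance of each class.
    """
    labeled = list(initial_idx)
    labels = {i: oracle(i) for i in labeled}
    for _ in range(n_queries):
        y = np.array([labels[i] for i in labeled])
        centroids = fit_centroids(X_pool[labeled], y)
        unlabeled = np.array([i for i in range(len(X_pool)) if i not in labels])
        proba = predict_proba(centroids, X_pool[unlabeled])
        uncertainty = 1.0 - proba.max(axis=1)  # least-confidence score
        query = int(unlabeled[np.argmax(uncertainty)])
        labels[query] = oracle(query)  # ask the oracle for one label
        labeled.append(query)
    y = np.array([labels[i] for i in labeled])
    return fit_centroids(X_pool[labeled], y), labeled

# Synthetic pool: two Gaussian blobs (illustrative data, not from the survey).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y_true = np.array([0] * 50 + [1] * 50)
centroids, labeled = active_learning_loop(X, lambda i: int(y_true[i]), [0, 50], 10)
print("queried instances:", labeled[2:])
```

Only the query rule (the `uncertainty` line) would change when swapping in another strategy, such as a representative-based one. The rest of the loop stays the same.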