Data contamination refers to the phenomenon where part of the data is randomly replaced by data generated from an unknown distribution. It applies to a wide range of real-world problems related to data quality, such as label noise, drifts in data distribution, and random errors in data entry. The impact of data contamination on the accuracy of pattern classification is studied, and an asymptotic error bound is established. Several applications for which the model gives insights are discussed.
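
As an illustration of the contamination mechanism described above, the following is a minimal simulation sketch, not the model or the bound from the paper. Everything specific in it is an assumption made for the example: the helper names `contaminate`, `noise_sampler`, and `accuracy`, the two Gaussian classes, the fixed threshold classifier, the uniform contaminating distribution, and the contamination rate of 0.2. Each sample is replaced, independently and with probability `eps`, by a draw from the contaminating distribution, and the accuracy of the same classifier is then measured on the clean and contaminated sets.

```python
import random


def contaminate(data, eps, noise_sampler, rng):
    """Replace each item independently with probability eps.

    Replacements are drawn from noise_sampler, which stands in for the
    unknown contaminating distribution (a hypothetical choice here).
    Returns the new list and a boolean mask marking replaced items.
    """
    out, mask = [], []
    for item in data:
        replaced = rng.random() < eps
        out.append(noise_sampler(rng) if replaced else item)
        mask.append(replaced)
    return out, mask


def noise_sampler(rng):
    # Assumed contaminating distribution: feature uniform on [-3, 6],
    # label chosen by a fair coin, so it carries no class information.
    return (rng.uniform(-3.0, 6.0), rng.randint(0, 1))


def accuracy(data, threshold=1.5):
    # Fixed threshold classifier: predict class 1 when x > threshold.
    correct = sum(1 for x, y in data if (x > threshold) == (y == 1))
    return correct / len(data)


rng = random.Random(0)

# Assumed clean data: two unit-variance Gaussian classes centred at 0 and 3.
clean = [(rng.gauss(0.0, 1.0), 0) for _ in range(5000)]
clean += [(rng.gauss(3.0, 1.0), 1) for _ in range(5000)]

noisy, mask = contaminate(clean, eps=0.2, noise_sampler=noise_sampler, rng=rng)

acc_clean = accuracy(clean)
acc_noisy = accuracy(noisy)
print(f"replaced fraction: {sum(mask) / len(mask):.3f}")
print(f"accuracy on clean data: {acc_clean:.3f}")
print(f"accuracy on contaminated data: {acc_noisy:.3f}")
```

Under these assumptions the replaced samples are classified no better than chance, so the measured accuracy falls as `eps` grows. Swapping in a different `noise_sampler` (for example, one that keeps the feature and flips only the label) would mimic the label-noise case mentioned in the abstract.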