Machine learning, artificial intelligence (AI), and statistical learning are related mathematical fields that use computer algorithms to build models for data description, prediction, or both. Well-known examples include biometric identification and authorization systems, speech recognition, and user-targeted internet advertising. Statistical learning, the approach used in this paper, also has many applications in semiconductor manufacturing.

Semiconductor data has several challenging characteristics: high dimensionality, mixtures of categorical and numeric data, non-randomly missing data, non-Gaussian and multimodal distributions, complex nonlinear relationships, noise, outliers, and temporal dependencies. These challenges are becoming particularly acute as the quantity of available data increases and as the ability to trace lots, wafers, die, and packages through the full fab, wafer test, assembly, and final test manufacturing flow improves. Statistical-learning techniques are applied to address these challenges. In this paper we discuss the advancement and application of tree-based classification and regression methods to semiconductor data. We begin with a description of the problem, followed by an overview of the statistical-learning techniques used in our case studies. We then describe how the challenges presented by semiconductor data were addressed with original extensions to tree-based and kernel-based methods. Next, we review four case studies: home sales price prediction, signal identification and separation, final speed bin classification, and die pairing optimization for Multi-Chip Packages (MCP). Results from the case studies demonstrate how statistical learning addresses the challenges presented by semiconductor manufacturing data and enables improved data discovery and prediction compared with traditional statistical approaches.