Symbolic data analysis is an emerging data mining technique, and interval-valued data are the most commonly used type of symbolic data. This paper applies principal component analysis (PCA) for interval-valued symbolic data to evaluate the overall market performance of stocks. It first introduces the basic theory of symbolic data analysis. It then examines how to compute empirical descriptive statistics for a sample of interval data and, based on the empirical correlation matrix, presents an algorithm for interval principal component analysis that yields principal component values expressed as intervals. Finally, an empirical study uses one week of trading data for 20 stocks listed on the Shanghai Stock Exchange; based on a rectangle plot of the interval principal component scores, the 20 stocks are grouped into four classes according to their overall market performance.
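The abstract does not spell out its formulas, so the following is only a minimal sketch of how an interval PCA based on an empirical correlation matrix might look. It assumes the commonly used empirical moments for interval data (values uniformly distributed within each interval, so the variance includes a within-interval term and the covariance is taken between interval midpoints) and assumes that interval scores are obtained by interval arithmetic on the standardized bounds. The function name `interval_pca` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def interval_pca(lower, upper, n_components=2):
    """Sketch of PCA for interval data [lower, upper] of shape (n, p).

    Assumption: values are uniform within each interval, which gives the
    empirical mean and variance used below. Not the paper's stated method.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n = lower.shape[0]

    # Empirical mean: average of the interval midpoints.
    mid = (lower + upper) / 2.0
    mean = mid.mean(axis=0)

    # Empirical variance: includes the spread inside each interval.
    second_moment = (lower**2 + lower * upper + upper**2).sum(axis=0) / (3.0 * n)
    var = second_moment - mean**2
    std = np.sqrt(var)

    # Empirical covariance between variables, computed from midpoints,
    # with the interval-aware variances placed on the diagonal.
    centered = mid - mean
    cov = centered.T @ centered / n
    np.fill_diagonal(cov, var)
    corr = cov / np.outer(std, std)

    # Eigen-decomposition of the symmetric correlation matrix,
    # sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Standardize both bounds, then project with interval arithmetic:
    # a positive loading maps lower to lower, a negative one swaps the bounds.
    z_lo = (lower - mean) / std
    z_hi = (upper - mean) / std
    pos = np.clip(eigvecs, 0.0, None)
    neg = np.clip(eigvecs, None, 0.0)
    score_lo = z_lo @ pos + z_hi @ neg
    score_hi = z_hi @ pos + z_lo @ neg
    return eigvals, eigvecs, score_lo, score_hi
```

Under these assumptions, each stock receives one interval per principal component. Plotting the first two intervals against each other gives one rectangle per stock, which is the kind of display the abstract uses to group the stocks. When every interval collapses to a single point, the sketch reduces to classical PCA on the correlation matrix.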