Objective: To identify suitable methods for constructing standard growth curves for anthropometric measurements. Methods: Using the construction of gestational-age-specific birth weight percentile curves for newborns in Guangzhou as an example, three methods for identifying and removing outliers (the Tukey method, robust regression, and the Gaussian mixture model) were compared to find the one giving the best data preprocessing result. The effects of the cubic spline, LMS, and GAMLSS methods on the construction of the percentile curves were then compared. Results: The Gaussian mixture model identified the main distribution within multimodal data well, whereas for unimodal data robust regression was more reliable than the Tukey method. In terms of goodness of fit and the ability to identify small-for-gestational-age (SGA) and large-for-gestational-age (LGA) infants, the gestational-age-specific birth weight percentile curves constructed with GAMLSS were estimated more accurately than those constructed with the cubic spline and LMS methods. Conclusion: In data preprocessing, the method for identifying and removing outliers should be chosen according to the characteristics of the data distribution. In curve smoothing, GAMLSS can model up to the fourth moment, and the resulting percentile curves are smooth and have smaller error.
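To make two of the building blocks named above more concrete, the following is a minimal Python sketch, not the authors' implementation. It shows Tukey fences for flagging outlying birth weights within a single gestational week, and the standard LMS (Box-Cox normal) formula for turning fitted L, M and S values into a centile. The function names, the fence multiplier k = 1.5, the sample weights and the L, M, S values are all illustrative assumptions. The 10th and 90th centiles are used only as the conventional SGA and LGA cutoffs; the abstract does not state which cutoffs the study applied.

```python
import math
from statistics import NormalDist, quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that lie inside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if lower <= v <= upper]


def lms_centile(L, M, S, p):
    """Centile at probability p for given Box-Cox power L, median M, and
    coefficient of variation S (standard LMS formula)."""
    z = NormalDist().inv_cdf(p)
    if abs(L) < 1e-12:
        # Limiting case L -> 0 reduces to a log-normal centile.
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Hypothetical birth weights (g) for one gestational week; 900 g is implausible.
weights = [2900, 3000, 3100, 3200, 3300, 3400, 3500, 900]
cleaned = drop_outliers(weights)

# Hypothetical L, M, S values for the same week (not taken from the study).
p10 = lms_centile(0.8, 3200.0, 0.13, 0.10)  # conventional SGA cutoff
p50 = lms_centile(0.8, 3200.0, 0.13, 0.50)  # median, equals M
p90 = lms_centile(0.8, 3200.0, 0.13, 0.90)  # conventional LGA cutoff
```

In a full analysis the outlier step would be repeated for every gestational week, and L, M and S would be estimated as smooth functions of gestational age rather than supplied by hand. GAMLSS generalizes this idea by also allowing a kurtosis parameter to be modeled.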