Image mosaicking is a core task in remote sensing image processing and plays an important role in the analysis of imagery that spans multiple regions. Traditional parallel algorithms for remote sensing images suffer from low utilization of compute nodes and frequent data I/O. To address these problems, this paper builds on the Spark distributed in-memory computing framework, exploits Spark's strength in iterative data processing, and proposes a parallel mosaicking method based on a custom Spark RDD (Resilient Distributed Dataset). First, the method estimates image overlap regions with the phase correlation method on multiple cluster nodes, so that overlap estimation itself runs in parallel across nodes. Second, it defines an RDD tailored to remote sensing image processing by overriding the compute and getPartitions methods of Spark's RDD, and exposes the three key steps of mosaicking (overlap region estimation, image registration, and image fusion) as Transformation operators of that custom RDD. Finally, it creates the custom RDD through an implicit conversion and calls its operators to carry out the mosaic in parallel. Experimental results show that, compared with a traditional MPI-based parallel mosaicking algorithm, the method improves mosaicking efficiency for large data volumes while preserving mosaic quality.
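The phase correlation step named in the abstract can be illustrated with a small, dependency-free sketch. This is not the paper's implementation: the function names (`dft2`, `phase_correlation`, `circular_shift`) are hypothetical, the transform is a naive DFT rather than an FFT, and the example assumes the second image is a pure circular translation of the first, which real overlapping scenes only approximate. It shows only the core idea, that the normalized cross-power spectrum of two translated images inverts to a peak located at the translation.

```python
import cmath


def dft2(img, inverse=False):
    """Naive 2-D discrete Fourier transform of a list-of-lists image."""
    rows, cols = len(img), len(img[0])
    sign = 1 if inverse else -1

    def dft1(vec):
        n = len(vec)
        out = [
            sum(vec[k] * cmath.exp(sign * 2j * cmath.pi * u * k / n)
                for k in range(n))
            for u in range(n)
        ]
        # Only the inverse transform carries the 1/n scaling.
        return [v / n for v in out] if inverse else out

    # The 2-D DFT is separable: transform every row, then every column.
    by_rows = [dft1(row) for row in img]
    by_cols = [dft1([by_rows[r][c] for r in range(rows)]) for c in range(cols)]
    return [[by_cols[c][r] for c in range(cols)] for r in range(rows)]


def phase_correlation(ref, moved, eps=1e-12):
    """Return (dy, dx), the circular shift that maps `ref` onto `moved`."""
    rows, cols = len(ref), len(ref[0])
    f_ref, f_mov = dft2(ref), dft2(moved)

    # Normalized cross-power spectrum: keep the phase, discard the magnitude.
    cross = []
    for r in range(rows):
        line = []
        for c in range(cols):
            prod = f_mov[r][c] * f_ref[r][c].conjugate()
            line.append(prod / (abs(prod) + eps))
        cross.append(line)

    # The inverse transform peaks at the translation between the images.
    corr = dft2(cross, inverse=True)
    peak_r, peak_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: corr[rc[0]][rc[1]].real,
    )

    # Peak indices past the midpoint correspond to negative shifts.
    dy = peak_r - rows if peak_r > rows // 2 else peak_r
    dx = peak_c - cols if peak_c > cols // 2 else peak_c
    return dy, dx


def circular_shift(img, dy, dx):
    """Shift an image by (dy, dx) with wrap-around (test helper, assumed)."""
    rows, cols = len(img), len(img[0])
    return [[img[(r - dy) % rows][(c - dx) % cols] for c in range(cols)]
            for r in range(rows)]


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    tile = [[rng.random() for _ in range(8)] for _ in range(8)]
    print(phase_correlation(tile, circular_shift(tile, 2, -3)))
```

In a mosaicking pipeline the recovered offset would bound the overlap region between two adjacent scenes. A production version would more plausibly use an FFT library and window the inputs, since real overlaps are partial rather than circular.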