Research on Image Quality Assessment Models Based on the Free-Energy Principle and Natural Image Statistics

Source: Shanghai Jiao Tong University

【Author】: Zhu Wenhan (朱文瀚)

【Institution】: Shanghai Jiao Tong University

【Published in】: Shanghai Jiao Tong University

【Publication date】: 2023, Issue 01

Excerpt from the thesis

With the advent of the information age, visual media have produced billions of digital images, which have gradually become an indispensable part of daily life. However, in the process of acquisition, storage, compression, and transmission, the quality of digital images is inevitably affected by different kinds of distortions. Most images in digital applications are produced for human consumers, and their visual quality has a direct impact on user experience. Measuring image quality is therefore essential for designing systems that deliver a better visual experience. Image quality assessment (IQA) is dedicated to this problem and can be divided into subjective IQA and objective IQA. Subjective IQA is generally considered the most accurate method, since the human eye is the ultimate evaluator of visual experience. However, subjective IQA is slow, expensive, and laborious, which severely limits where it can be applied. To this end, many researchers have focused on objective IQA, which is classified into full-reference (FR), reduced-reference (RR), and no-reference (NR) models according to the accessibility of the reference image. Although scholars have proposed plenty of objective models to automatically and accurately predict the perceptual quality of a target image, this field of study is still in its early stage. Owing to the lack of knowledge about the perception mechanism of the human visual system (HVS), existing objective models have low interpretability and cannot simulate subjective perception accurately. In addition, they are mainly aimed at simulated distortions, so their performance on distorted images in real-world scenarios is yet to be validated. Focusing on these issues, we conduct research on RR and NR metrics based on the free-energy principle in the human visual perception mechanism and on natural image statistics modeling. This dissertation adds to this field of study with the following contributions.

First, without an in-depth understanding of the perception mechanism of the HVS, most objective algorithms measure the degree of distortion of the signal itself to evaluate image quality. Such models may not assess image quality accurately, since they do not take the human visual perception mechanism into consideration. Considering this issue, we present an RR IQA model based on the free-energy principle and multi-channel decomposition, called MCFRM (Multi-Channel Free-energy based Reduced-reference quality Metric). The free-energy principle, which has been widely researched in brain science and neuroscience, is introduced to quantify perception, action, and learning in the human brain. It indicates that the brain resorts to an internal generative model to explain a visual stimulus and yields a corresponding prediction. The discrepancy between the visual stimulus and its prediction is highly related to perceptual quality. On the other hand, abundant psychological and neurobiological studies have revealed that different frequency and orientation components of one visual stimulus arouse different neurons in the striate cortex, and that the striate cortex processes visual information in the cerebral cortex. Inspired by these two aspects, MCFRM first employs a two-level discrete Haar wavelet transform (DHWT) to decompose the input reference and distorted images. Next, sparse representation, which has been shown to approximate the strategy of the primary visual cortex in representing natural images, is used to simulate the generative model in the human brain. Then, for each subband of the reference and distorted images, the difference between the visual stimulus and the prediction produced by the internal generative model is computed based on the free-energy principle. The self free-energy features and combined free-energy features of each pair of reference and distorted subband images are extracted. Finally, we obtain the overall quality metric by regressing these features with a support vector regressor (SVR). Experimental results demonstrate that the proposed model is superior to metrics of the same kind and that its time complexity is lower than that of competing methods. In addition, the proposed metric needs only four scalars extracted from the pristine image, which alleviates the burden of transmitting information about reference images.

Second, since the human eye is the ultimate evaluator of visual experience, the modeling of the HVS is a key issue for objective IQA and for visual experience optimization. However, traditional models based on black-box fitting have low interpretability and can hardly guide experience optimization effectively, while models based on physiological simulation are difficult to integrate into practical visual communication services because of their high computational complexity. To bridge the gap between signal distortion and visual experience, this thesis proposes a novel perceptual NR IQA algorithm based on structural computational modeling of the HVS, called NSCHM (No-reference Structural Computational of Human visual system Metric). We investigate and analyze the perception mechanism of the HVS in depth based on multi-layer representations of the image. Informed by image processing theories, we divide visual stimuli into three bottom-up layers: a low-level visual layer, a middle-level visual layer, and a high-level visual layer. Specifically, the global image is regarded as the high-level visual stimulus, the local primitives obtained by decomposing the global image are treated as the middle-level visual stimulus, and the low-level visual layer is composed of all individual pixels in the image. The mean subtracted contrast normalized (MSCN) coefficients of pixels, based on natural scene statistics (NSS), are extracted as low-level visual features. Deep features computed from multiple pseudo reference images through a VGG network are used as middle-level features, which reflect the characteristics of the target image's local primitives. Free-energy-based features, which synthesize the properties of the whole image, are treated as high-level features. An SVR is used to aggregate the features extracted from the three layers into a perceptual quality index. Extensive experiments illustrate that the proposed metric is effective and superior or comparable to state-of-the-art NR models.

Third, Ultra High-Definition (UHD) content with 4K resolution has become a major selling point of broadcast television programs and online media in recent years. However, plenty of pseudo 4K content upscaled from lower resolutions is circulating on the network, because lower-resolution content is cheaper and technically easier to obtain. Such "4K" programs tend to dampen consumer enthusiasm and waste scarce bandwidth. To fill this gap, this thesis puts forward an efficient NR IQA model named TPIM (True or Pseudo 4K Image quality assessment Metric), which predicts the perceptual quality of UHD images and can distinguish between true and pseudo 4K images. Owing to the lack of appropriate data sets, we first establish a database of true and pseudo 4K images through subjective experiments. This database contains 350 natural 4K images with different image contents and 2,802 pseudo 4K images upscaled from 1080p and 720p images through fourteen interpolation algorithms, including classical interpolation models, popular deep learning interpolation algorithms, and software and hardware commonly applied in the industry. In TPIM, to improve computational efficiency, we divide a complete target image into 16×9 patches. Then, we use local variance to select the three most representative sub-images to stand for the whole target image and extract the corresponding image complexity features. Next, on these representative patches, we calculate the histogram features of the normalized power spectrum and the cut-off frequency features of the cumulative power spectrum in the frequency domain, as well as the NSS-based features of the MSCN coefficients. The prediction index of TPIM is obtained by aggregating these three groups of features with an SVR. The experimental results show that the proposed model outperforms competitive NR IQA methods in predicting the quality of 4K images and is highly capable of distinguishing true from pseudo 4K images. It is noteworthy that the proposed algorithm has been integrated into hardware instruments and applied in practical scenarios.
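To make the TPIM front end more concrete, the following Python sketch illustrates two of the steps named in the abstract: splitting an image into a 16×9 patch grid and choosing three patches by local variance, then computing MSCN coefficients on a patch. It is a minimal sketch, not the thesis's implementation. The abstract does not say how "most representative" is defined, so ranking patches by highest pixel variance is an assumption, as are the function names, the 7×7 Gaussian window, and the stabilizing constant `c = 1.0` (values commonly used in NSS-based metrics).

```python
import numpy as np


def split_into_grid(image, cols=16, rows=9):
    """Split a 2-D grayscale image into rows x cols equal patches.

    Pixels that do not fit an integer patch size are cropped at the
    right and bottom edges.
    """
    h, w = image.shape
    ph, pw = h // rows, w // cols
    return [
        image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
        for r in range(rows)
        for c in range(cols)
    ]


def select_patches(image, k=3, cols=16, rows=9):
    """Return the k patches with the largest pixel variance.

    ASSUMPTION: "most representative" is read here as "highest variance";
    the abstract only says that local variance is used.
    """
    patches = split_into_grid(image, cols, rows)
    variances = np.array([p.var() for p in patches])
    order = np.argsort(variances)[::-1][:k]
    return [patches[i] for i in order]


def gaussian_kernel(size=7, sigma=7.0 / 6.0):
    """1-D Gaussian window normalized to sum to 1 (assumed parameters)."""
    ax = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def local_mean(image, kernel):
    """Gaussian-weighted local mean via a separable filter, reflect padding."""
    pad = len(kernel) // 2
    padded = np.pad(image, pad, mode="reflect")
    smooth = lambda v: np.convolve(v, kernel, mode="valid")
    tmp = np.apply_along_axis(smooth, 1, padded)   # filter along rows
    return np.apply_along_axis(smooth, 0, tmp)     # then along columns


def mscn(image, c=1.0):
    """Mean subtracted contrast normalized coefficients:
    (I - mu) / (sigma + c), with mu and sigma the local mean and
    local standard deviation under the Gaussian window.
    """
    image = image.astype(np.float64)
    kernel = gaussian_kernel()
    mu = local_mean(image, kernel)
    var = local_mean(image * image, kernel) - mu * mu
    sigma = np.sqrt(np.maximum(var, 0.0))  # clamp tiny negative values
    return (image - mu) / (sigma + c)
```

For a 3840×2160 grayscale frame held in a NumPy array `frame`, `select_patches(frame)` would return three 240×240 patches, and `mscn(patch)` could then be applied to each before fitting a statistical model to the coefficients. The spectrum features and the SVR stage are outside the scope of this sketch.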

