With the advent of the information age, visual media have produced billions of digital images, which have gradually become an indispensable part of daily life. However, during acquisition, storage, compression, and transmission, the quality of digital images is inevitably degraded by various kinds of distortion. Most images in digital applications are produced for human consumers, and their visual quality has a direct impact on user experience. Measuring image quality is therefore essential for designing systems that deliver a better visual experience. Image quality assessment (IQA) addresses this problem and can be divided into subjective IQA and objective IQA. Subjective IQA is generally considered the most accurate approach, since the human eye is the ultimate evaluator of visual experience. However, subjective IQA is slow, expensive, and laborious, which severely limits the scenarios in which it can be applied. For this reason, many researchers have focused on objective IQA, which is classified into full-reference (FR), reduced-reference (RR), and no-reference (NR) models according to the accessibility of the reference image. Although many objective models have been proposed to predict the perceptual quality of a target image automatically and accurately, this field of study is still at an early stage. Owing to limited knowledge of the perception mechanism of the human visual system (HVS), existing objective models have low interpretability and cannot simulate subjective perception accurately. In addition, they mainly target simulated distortions, so their performance on images distorted in real-world scenarios has yet to be validated. To address these issues, we conduct research on RR and NR metrics based on the free-energy principle of the human visual perception mechanism and on natural image statistics modeling. This dissertation makes the following contributions.

First, without an in-depth
understanding of the perception mechanism of the HVS, most objective algorithms evaluate image quality by measuring the degree of distortion of the signal itself. Such models may not assess image quality accurately because they do not take the human visual perception mechanism into account. To address this issue, we present an RR IQA model based on the free-energy principle and multi-channel decomposition, called MCFRM (Multi-Channel Free-energy based Reduced-reference quality Metric). The free-energy principle, which has been widely studied in brain science and neuroscience, was introduced to quantify perception, action, and learning in the human brain. It indicates that the brain relies on an internal generative model to explain a visual stimulus and to yield a corresponding prediction. The discrepancy between the visual stimulus and its prediction is closely related to perceptual quality. In addition, numerous psychological and neurobiological studies have revealed that different frequency and orientation components of a visual stimulus excite different neurons in the striate cortex, and that the striate cortex processes visual information in the cerebral cortex. Inspired by these two observations, MCFRM first applies a two-level discrete Haar wavelet transform (DHWT) to decompose the reference and distorted input images. Next, sparse representation, which has been shown to approximate the strategy the primary visual cortex uses to represent natural images, is used to simulate the generative model in the human brain. Then, for each subband of the reference and distorted images, the difference between the visual stimulus and the prediction produced by the internal generative model is computed according to the free-energy principle. The self free-energy features and combined free-energy features of each pair of reference and distorted subband images are extracted. Finally, we obtain the overall quality metric by regressing these features
using a support vector regressor (SVR). Experimental results demonstrate that the proposed model outperforms metrics of the same kind and has lower time complexity than its competitors. In addition, the proposed metric needs only four scalars extracted from the pristine image, which reduces the burden of transmitting reference-image information.

Second, since the human eye is the ultimate evaluator of visual experience, modeling the HVS is a key issue for objective IQA and for optimizing visual experience. However, traditional models based on black-box fitting have low interpretability and can hardly guide experience optimization effectively, while models based on physiological simulation are difficult to integrate into practical visual communication services because of their high computational complexity. To bridge the gap between signal distortion and visual experience, this thesis proposes a novel perceptual NR IQA algorithm based on structural computational modeling of the HVS, called NSCHM (No-reference Structural Computational of Human visual system Metric). We investigate and analyze the perception mechanism of the HVS in depth based on multi-layer representations of the image. Informed by image processing theory, we divide visual stimuli into three bottom-up layers: a low-level visual layer, a middle-level visual layer, and a high-level visual layer. Specifically, the global image is regarded as the high-level visual stimulus, the local primitives obtained by decomposing the global image are treated as the middle-level visual stimulus, and the low-level visual layer is composed of all individual pixels in the image. The mean subtracted contrast normalized (MSCN) coefficients of pixels, based on natural scene statistics (NSS), are extracted as low-level visual features. Deep features computed by a VGG network from multiple pseudo-reference images are used as middle-level features, which reflect the
characteristics of the target image’s local primitives. Free-energy based features, which summarize the properties of the whole image, are treated as high-level features in the proposed method. SVR is used to aggregate the features extracted from the three layers into a perceptual quality index. Extensive experiments show that the proposed metric is effective and is superior or comparable to state-of-the-art NR models.

Third, Ultra High-Definition (UHD) content with 4K resolution has become a major selling point of broadcast television programs and online media in recent years. However, a large amount of pseudo 4K content upscaled from lower resolutions is circulating on the network, because lower-resolution content is cheaper and technically easier to obtain. Such “4K” programs often disappoint consumers and waste scarce bandwidth. To fill this gap, this thesis puts forward an efficient NR IQA model named TPIM (True or Pseudo 4K Image quality assessment Metric), which predicts the perceptual quality of UHD images and can distinguish between true and pseudo 4K images. Because no appropriate data set exists, we first establish a database of true and pseudo 4K images through subjective experiments. The database contains 350 natural 4K images with different content and 2,802 pseudo 4K images upscaled from 1080p and 720p images by fourteen interpolation algorithms, including classical interpolation models, popular deep learning interpolation algorithms, and software and hardware commonly used in industry. In TPIM, to improve computational efficiency, we divide the complete target image into 16×9 patches. Then, we use local variance to select the three most representative sub-images to stand for the whole target image and extract the corresponding image complexity features. Next, we calculate the histogram features of the normalized power spectrum and the cut-off frequency features of the cumulative power spectrum in the
frequency domain, as well as the NSS-based features of the MSCN coefficients, on these representative patches. The prediction index of TPIM is obtained by aggregating these three groups of features using SVR. Experimental results show that the proposed model outperforms competing NR IQA methods in predicting the quality of 4K images and is highly capable of distinguishing true from pseudo 4K images. Notably, the proposed algorithm has been integrated into hardware instruments and applied in practical scenarios.
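
The patch-selection step of TPIM can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not state the grid orientation or the selection rule, so the code assumes 16 columns by 9 rows (matching a 16:9 frame) and assumes that the three patches with the highest pixel variance are the "representative" ones. The function names are hypothetical, and plain Python lists stand in for an array library to keep the example self-contained.

```python
from statistics import pvariance


def split_into_patches(image, cols=16, rows=9):
    """Split a 2-D grayscale image (list of pixel rows) into rows*cols patches.

    Returns a list of ((row_index, col_index), patch) tuples.
    Pixels that do not fit an integer patch size are dropped (assumption).
    """
    height, width = len(image), len(image[0])
    ph, pw = height // rows, width // cols  # patch height and width
    if ph == 0 or pw == 0:
        raise ValueError("image is smaller than the patch grid")
    patches = []
    for r in range(rows):
        for c in range(cols):
            patch = [row[c * pw:(c + 1) * pw]
                     for row in image[r * ph:(r + 1) * ph]]
            patches.append(((r, c), patch))
    return patches


def patch_variance(patch):
    """Population variance of all pixel values in one patch."""
    return pvariance([pixel for row in patch for pixel in row])


def select_representative_patches(image, k=3, cols=16, rows=9):
    """Return the k patches with the highest variance (assumed selection rule)."""
    patches = split_into_patches(image, cols, rows)
    ranked = sorted(patches, key=lambda item: patch_variance(item[1]),
                    reverse=True)
    return ranked[:k]
```

Calling `select_representative_patches(image)` on a grayscale frame returns three `(position, patch)` pairs. In a fuller pipeline, the frequency-domain and MSCN features described above would then be computed on those three patches only, which is where the efficiency gain would come from.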