Towards three-dimensional visual saliency
A salient image region is an image part that differs clearly from its surround in one or more attributes. In bottom-up processing, these attributes are contrast, color difference, brightness, and orientation. By measuring these attributes, visual saliency algorithms aim to predict the regions in an image that would attract our attention under free-viewing conditions, i.e., when the observer views an image without a specific task such as searching for an object. To quantify the interesting locations in a scene, the output of a visual saliency algorithm is usually expressed as a two-dimensional grayscale map in which brighter regions correspond to more salient regions in the original image. In addition to advancing our understanding of the human visual system, visual saliency models can be used in a number of computer vision applications, including image compression, computer graphics, image matching and recognition, design, and human-computer interaction.

The main contributions of this thesis are as follows. First, we present a method to inspect how well Itti’s classic saliency algorithm separates salient from non-salient image locations. Our results show that, although the saliency model discriminates well between highly salient and non-salient regions, there is a large overlap between the locations that lie in the middle range of saliency. Second, we propose a new bottom-up visual saliency model for static two-dimensional images. In our model, we calculate saliency by using the transformations associated with the dihedral group D4. Our results suggest that the proposed saliency model outperforms many state-of-the-art saliency models. The same methodology can be extended to calculate saliency in three-dimensional scenes, which we intend to implement in the future. Third, we propose a way to perform statistical analysis of fixation data from different observers and different images. Based on this analysis, we present a robust metric for judging the performance of visual saliency algorithms. Our results show that the proposed metric can be used to alleviate the problems pertaining to the evaluation of saliency models. Fourth, we introduce a new approach to compressing an image based on the salient locations predicted by saliency models. Our results show that the compressed images do not exhibit visual artifacts and appear very similar to the originals. Fifth, we outline a method to estimate depth from eye fixations in three-dimensional virtual scenes, which can be used to create so-called gaze maps for three-dimensional scenes. In the future, such maps can serve as ground truth for judging the performance of saliency algorithms on three-dimensional images.

We believe that our contributions can lead to a better understanding of saliency, address the major issues associated with the evaluation of saliency models, highlight the contributions of top-down and bottom-up processing based on the analysis of a comprehensive eye tracking dataset, promote the use of image processing applications steered by human vision, and pave the way for calculating saliency in three-dimensional scenes.
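The abstract names the dihedral group D4 as the basis of the proposed saliency model but does not spell out the computation. The sketch below is only an illustration of what the eight D4 transformations of a square image block look like (four rotations and four reflections). The scoring rule (mean absolute difference between a block and its transformed copies) and the function names `d4_transforms`, `d4_asymmetry`, and `asymmetry_map` are assumptions made for this example, not the thesis's actual algorithm.

```python
import numpy as np


def d4_transforms(patch):
    """Return the eight D4 images of a square patch: 4 rotations, 4 reflections."""
    rotations = [np.rot90(patch, k) for k in range(4)]   # 0, 90, 180, 270 degrees
    reflections = [np.fliplr(r) for r in rotations]      # each rotation mirrored
    return rotations + reflections


def d4_asymmetry(patch):
    """Hypothetical score: mean absolute difference between a patch and its
    seven non-identity D4 images. A perfectly symmetric patch scores 0."""
    patch = np.asarray(patch, dtype=float)
    diffs = [np.abs(patch - t).mean() for t in d4_transforms(patch)[1:]]
    return float(np.mean(diffs))


def asymmetry_map(image, block=8):
    """Score each non-overlapping block x block tile of a grayscale image.
    Rows and columns that do not fill a whole tile are ignored."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape[0] // block, image.shape[1] // block
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            tile = image[i * block:(i + 1) * block, j * block:(j + 1) * block]
            out[i, j] = d4_asymmetry(tile)
    return out
```

Under these assumptions, a uniform tile scores zero and a tile with structure that is not invariant under rotation or reflection scores higher, which yields a coarse grayscale map of the kind described above.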
Consists of
Alsam, Ali; Sharma, Puneet. Analysis of eye fixations data. Proceedings of the 13th IASTED International Conference on Signal and Image Processing: 342-349, 2011.
Sharma, Puneet; Alsam, Ali. A robust metric for the evaluation of visual saliency models. Proceedings of the 9th International Conference on Computer Vision Theory and Applications: 654-661, 2014.
Alsam, Ali; Sharma, Puneet. Robust metric for the evaluation of visual saliency algorithms. Journal of the Optical Society of America A. 31(3): 532-540, 2014. 10.1364/JOSAA.31.000532.
Alsam, Ali; Sharma, Puneet. Validating the Visual Saliency Model. Image Analysis ; 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013. Proceedings: 153-161, 2013. 10.1007/978-3-642-38886-6_15.
Alsam, Ali; Sharma, Puneet; Wrålsen, Anette. Asymmetry as a Measure of Visual Saliency. Image Analysis ; 18th Scandinavian Conference, SCIA 2013, Espoo, Finland, June 17-20, 2013. Proceedings: 591-600, 2013. 10.1007/978-3-642-38886-6_55.
Alsam, Ali; Sharma, Puneet; Wrålsen, Anette. Calculating Saliency Using the Dihedral Group D4. Journal of Imaging Science and Technology. (ISSN 1062-3701). 58(1): 10504-1-10504-12, 2014. 10.2352/J.ImagingSci.Technol.2014.58.1.010504.
Alsam, Ali; Rivertz, Hans Jakob; Sharma, Puneet. What the Eye Did Not See – A Fusion Approach to Image Coding. Advances in Visual Computing ; 8th International Symposium, ISVC 2012, Rethymnon, Crete, Greece, July 16-18, 2012, Revised Selected Papers, Part II: 199-208, 2012. 10.1007/978-3-642-33191-6_20.
Alsam, Ali; Rivertz, Hans Jakob; Sharma, Puneet. What the eye did not see--a fusion approach to image coding. International journal on artificial intelligence tools. (ISSN 0218-2130). 22(6), 2013. 10.1142/S0218213013600142.
Sharma, Puneet; Nilsen, Jan Harald; Skramstad, Torbjørn; Alaya Cheikh, Faouzi. Evaluation of Geometric Depth Estimation Model for Virtual Environment. Norsk Informatikkonferanse NIK-2010: 166-177, 2010.
Sharma, Puneet; Alsam, Ali. Estimating the Depth in Three-Dimensional Virtual Environment with Feedback. Proceedings of the 14th IASTED International Conference on Signal and Image Processing: 9-17, 2012.
Sharma, Puneet; Alsam, Ali. Estimating the Depth Uncertainty in Three-Dimensional Virtual Environment. Proceedings of the 14th IASTED International Conference on Signal and Image Processing: 18-25, 2012.