Once a saliency map is generated, a pyramid technique is applied to the image to create a sort of topological map that allows the most salient regions to be more easily identified. The general process behind the technique is to iteratively smooth and decimate the image, creating subsequant images which are half the size.
In many cases the smoothing process is carried out by a guassian filter, which preserves edges better than a similarily sized mean filter (take the k nearest neighbors to a pixel and evenly weight them). The guassian kernel weights the pixels according to a guassian distribution, which puts emphasis on nearby neighbors over distant ones (the degree of emphasis is controlled by the standard deviation of the kernel).
Since the gaussian is a symmetric distribution, the smoothing operation can be performed in the x and y directions seperately using a convolution.
Example of Intensity Pyramid Scales:
Papers:
Walther, Interactions 0f the Visual Attention and Ojection Recognition: Computational Modeling, Algorithms, and Psychophysics, 2006.
Itti, Koch, Niebur, A Model of Saliency-based Visual Attention for Rapid Scene Analysis, 1998.
No comments:
Post a Comment