Wednesday, January 28, 2009

Progress Update

The MATLAB algorithm is up and running, albeit without normalization or thresholding. The algorithm currently creates sub-maps for several feature channels (a rough sketch of how they might be computed follows the list below):



Intensity
|Red - Green|
|Blue - Yellow|
Orientation: pi/4
Orientation: pi/2
Orientation: 3pi/4
Orientation: pi
Final Saliency Map
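Below is a minimal sketch of how sub-maps like these might be computed in MATLAB. The file name, variable names, and simple per-channel arithmetic are illustrative assumptions rather than our exact code, and (as noted above) no normalization or thresholding is applied yet.

    % Illustrative sketch only -- not the exact project code.
    img = im2double(imread('scene.jpg'));   % hypothetical input image
    r = img(:,:,1);  g = img(:,:,2);  b = img(:,:,3);

    % Intensity sub-map: average of the three color channels.
    intensity = (r + g + b) / 3;

    % Color-opponency sub-maps, with yellow approximated
    % as the average of the red and green channels.
    rg = abs(r - g);             % |Red - Green|
    by = abs(b - (r + g) / 2);   % |Blue - Yellow|

    % The orientation sub-maps (pi/4, pi/2, 3pi/4, pi) would come from
    % filtering the intensity map with oriented (e.g. Gabor) kernels,
    % and the final saliency map from combining all of the sub-maps.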


Some examples of other early results:





Wednesday, January 21, 2009

The Pyramids of Gauss



As part of generating a saliency map, a pyramid technique is applied to the image to create a sort of topographical map that allows the most salient regions to be more easily identified.  The general process behind the technique is to iteratively smooth and decimate the image, creating subsequent images which are each half the size of the previous one.

In many cases the smoothing is carried out by a Gaussian filter, which preserves edges better than a similarly sized mean filter (which takes the neighbors in a window around a pixel and weights them all equally).  The Gaussian kernel instead weights the pixels according to a Gaussian distribution, which puts more emphasis on nearby neighbors than on distant ones (the degree of emphasis is controlled by the standard deviation of the kernel).
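For concreteness, the two kinds of kernels can be compared in MATLAB with fspecial (assuming the Image Processing Toolbox is available); the 5x5 window size, the standard deviation, and the file name below are arbitrary choices for illustration.

    % Averaging (mean) kernel: every neighbor in the window weighted equally.
    h_mean  = fspecial('average', 5);
    % Gaussian kernel: weights fall off with distance, controlled by sigma.
    h_gauss = fspecial('gaussian', 5, 1.0);

    img = im2double(imread('scene.jpg'));        % hypothetical input image
    smoothed_mean  = imfilter(img, h_mean,  'replicate');
    smoothed_gauss = imfilter(img, h_gauss, 'replicate');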


Since the two-dimensional Gaussian kernel is separable (it factors into the product of two one-dimensional Gaussians), the smoothing operation can be performed in the x and y directions separately using two 1-D convolutions.
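Putting the two ideas together, a rough sketch of the smooth-and-decimate loop using separable 1-D convolutions might look like the following; the kernel width, standard deviation, and number of pyramid levels are arbitrary assumptions for illustration, not the settings we use.

    % Sketch of a Gaussian pyramid built with separable 1-D convolutions.
    img = im2double(rgb2gray(imread('scene.jpg')));   % hypothetical grayscale input
    g1d = fspecial('gaussian', [1 5], 1.0);           % 1-D Gaussian kernel

    num_levels = 5;
    pyramid = cell(1, num_levels);
    pyramid{1} = img;
    for k = 2:num_levels
        % Smooth separably: convolve along the columns, then along the rows.
        smoothed = conv2(g1d, g1d, pyramid{k-1}, 'same');
        % Decimate: keep every other row and column, halving the size.
        pyramid{k} = smoothed(1:2:end, 1:2:end);
    end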

Example of Intensity Pyramid Scales:



Papers:

Walther, Interactions of Visual Attention and Object Recognition: Computational Modeling, Algorithms, and Psychophysics, 2006.

Itti, Koch, Niebur, A Model of Saliency-Based Visual Attention for Rapid Scene Analysis, 1998.

Sunday, January 11, 2009

Saliency Maps

My current reading deals with the creation of saliency maps: topographical maps that combine multiscale image features into a single measure that guides attentive selection.

A brief introduction to mapping is provided by Dr. Ernst Niebur here.

For the bottom-up approach, Christof Koch and Shimon Ullman proposed that different visual features, such as color, intensity, orientation, and movement, each contribute to how strongly a stimulus stands out. The saliency map integrates this information into one global measure, essentially a topographical map. The most 'salient' features of the image then correspond to the global maximum of this topographical map. (From my reading, the most popular method to accomplish this is a variation of gradient ascent, which follows the positive gradient to a local maximum.) In the bottom-up approach, these regions of greatest salience are then considered in sequential order, whereas a top-down approach would override the most salient regions in favor of more relevant areas.
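As a rough illustration of that integration step, the sketch below assumes three feature maps of equal size, each scaled to the range [0, 1], combines them with a simple unweighted average, and then locates the global maximum of the result; the map names and the averaging rule are assumptions for illustration only.

    % Sketch: combine feature maps into one saliency map and find its peak.
    % intensity, rg, and by are assumed to be equally sized maps in [0, 1].
    saliency = (intensity + rg + by) / 3;

    % The most salient location is the global maximum of the map.
    [maxVal, idx] = max(saliency(:));
    [row, col] = ind2sub(size(saliency), idx);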

Examples of a saliency map:




The above process: the first image is the original picture, the next three images correspond to the Color, Intensity, and Orientation feature maps respectively, followed by the final saliency map. Note that in practice a saliency map is often composed of even more feature maps. The final several pictures show the regions where the focus of attention (FOA) is directed, along with the amount of simulated time it takes to 'notice' these regions (essentially how long it would take for the images to pop out to the visual cortex).

A brief paper that outlines a method of creating and using saliency maps: A Model of Saliency-Based Visual Attention for Rapid Scene Analysis, by Itti, Koch, and Niebur.

Wednesday, January 7, 2009

More on Saliency

Visual saliency provides a relatively efficient way to quickly eliminate uninteresting items in the field of view from consideration.  This is largely due to the concepts behind the "bottom-up" approach mentioned in our introductory blog post.

The key objective of visual saliency is to quickly identify objects of interest.  When observing a picture such as the one below:

the object that stands out is clearly the red bar amongst the green.  This of course depends on the capabilities of the observer to distinguish colors and intensities.  Consider that without the capability to distinguish red from green, there would be no obvious item of interest in the image.

In the next image, it is difficult to spot anything of interest without searching through the image.  This is because there is little salience to guide you to the unique bar (can you find it?).  The orientation and color of most of the objects in the image are very similar and as such nothing seems too important.

Looking at the same image again, it is interesting to note what happens when the size of the viewable area is decreased:

Can you spot the unique bar now?  While it still may not be as immediately obvious as in the first image, looking at a subset of the image can make it easier to determine which objects are significant.  This approach can be taken further:

Now it should be clear that the vertical red bar is our object of interest.  Consider what happens if the view is further restricted:


In the above image, it now seems as if the horizontal red bar is our object of interest because all other bars are vertical.  Taken one final step:

Now we cannot tell which of the two bars is significant because they are the only two items in the field of view.  This presents some interesting facts about saliency:

Saliency is not an inherent attribute of an object; saliency depends upon the combined effects of many stimuli to make a given stimulus more interesting.

Changing the scale of search can improve our ability to find salient locations in an image, though it may cause objects which are of no interest at the broad scale to suddenly seem interesting.

Monday, January 5, 2009

The Introduction

Our CSE 190 project will deal with the topic of saliency, which is the state or quality of standing out relative to neighboring items.  We intend to focus on the bottom-up approach, which is based upon stimulus-driven signals that announce that a location is sufficiently different from its surroundings to be worthy of attention.

In contrast, the top-down approach focuses on object recognition using discriminant analysis.

Saliency is a plausible model for how biological systems direct attention and is considered to be a key attentional mechanism that facilitates learning and survival by enabling organisms to focus their limited perceptual and cognitive resources on the most pertinent subset of the available sensory data.

------------------------------------------

As of now our plan is to spend the next two weeks researching and learning more about saliency and the relevant algorithms associated with the bottom-up approach.  The goal of our project is to create a bottom-up saliency detector and apply it specifically to crowds, though the detector should be general enough for use in other applications.

------------------------------------------

Video of bottom-up saliency mapping on a busy freeway: