A machine learning framework for gaze guidance (Presented at the European Conference on Visual Perception 2009)
Eleonora Vig, Michael Dorr, Karl Gegenfurtner, and Erhardt Barth
What constitutes the difference between fixated and non-fixated movie
patches? How can we change a patch to make it more or less salient?
Here, we present a novel computational model of low-level saliency
with a dual emphasis: the same machine
learning framework is used (i) for predicting saccade targets in natural
dynamic scenes, and (ii) for learning how to alter the saliency level of
these targets.
We use a large data set of eye movements on high-resolution videos of
natural scenes. The 40,000 detected saccades are used to label movie
patches as attended and non-attended. The proposed saliency measure,
spectral energy, is computed in the neighborhood of each location on
each scale of an anisotropic spatio-temporal Laplacian pyramid. On this simple
low-dimensional representation of a patch (only one value, the spectral
energy, per scale) we train a support vector machine, which outperforms
state-of-the-art saliency predictors, reaching an ROC score of 0.8.
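
As a rough illustration of the per-scale representation and the ROC evaluation described above, the following minimal sketch may help. It rests on assumptions not stated in the abstract: spectral energy is taken to be the root-mean-square of the band-pass coefficients in a neighborhood, the toy coefficients are invented, and a plain sum of energies stands in for the classifier's decision value (the support vector machine itself is not reproduced). All function names are hypothetical.

```python
import math

def patch_energy(coeffs):
    """RMS of the band-pass coefficients in one neighborhood (assumed definition)."""
    return math.sqrt(sum(c * c for c in coeffs) / len(coeffs))

def energy_profile(pyramid_patch):
    """One energy value per pyramid scale: the low-dimensional patch descriptor."""
    return [patch_energy(level) for level in pyramid_patch]

def roc_auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: each patch holds band-pass coefficients for two scales.
attended = [[[3.0, 4.0], [1.0, 1.0]], [[0.0, 2.0], [2.0, 2.0]]]
non_attended = [[[1.0, 1.0], [0.0, 0.0]], [[3.0, 3.0], [1.0, 1.0]]]

profiles = [energy_profile(p) for p in attended + non_attended]
labels = [1, 1, 0, 0]  # 1 = attended (saccade target), 0 = non-attended

# Stand-in decision value (NOT the SVM from the text): summed energy over scales.
scores = [sum(profile) for profile in profiles]
auc = roc_auc(scores, labels)
```

In a fuller version, the energy profiles would be the input vectors of a trained classifier, and its decision values would replace `scores`.
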
Furthermore, we use this classifier to derive transformations in the
energy profiles that alter the saliency distribution of the scene.
Preliminary results show that gaze-contingent energy modifications
do indeed have a gaze-guiding effect.
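
One simple way to picture a transformation of an energy profile is a per-scale gain applied to the band-pass coefficients of a patch. The sketch below is an assumption, not the method of the abstract: it again treats energy as RMS, and `target_profile` is a hypothetical input (how the classifier yields such a target is not specified here). The gaze-contingent, real-time aspect is not modeled.

```python
import math

def rms_energy(coeffs):
    """RMS of band-pass coefficients (assumed energy definition)."""
    return math.sqrt(sum(c * c for c in coeffs) / len(coeffs))

def shift_energy_profile(pyramid_patch, target_profile, eps=1e-12):
    """Rescale each scale's coefficients so its RMS energy matches the target.

    Scaling coefficients by a gain g scales their RMS energy by g as well,
    so the required gain is target / current. A scale with (near-)zero
    energy cannot be changed by scaling and is left unmodified.
    """
    modified = []
    for level, target in zip(pyramid_patch, target_profile):
        current = rms_energy(level)
        gain = target / current if current > eps else 1.0
        modified.append([gain * c for c in level])
    return modified

# Hypothetical patch with two scales, and a hypothetical target profile that
# lowers the energy on the first scale and raises it on the second.
patch = [[3.0, 4.0], [1.0, 1.0]]
target_profile = [1.0, 2.0]

new_patch = shift_energy_profile(patch, target_profile)
new_profile = [rms_energy(level) for level in new_patch]
```

Collapsing the modified pyramid back into video frames would then yield a patch with an altered energy profile; that reconstruction step is outside this sketch.
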