SCENE-pathy: Capturing the Visual Selective Attention of People Towards Scene Elements

Andrea Toaiari¹, Federico Cunico¹, Francesco Taioli¹, Ariel Caputo¹, Gloria Menegaz¹, Andrea Giachetti¹, Giovanni Maria Farinella², Marco Cristani¹
¹ University of Verona, ² University of Catania
Teaser of SCENE-pathy

(a) The 3D model of the SCENE-pathy showroom scenario, with the 9 possible areas of interest highlighted by colored bounding boxes. (b), (c) On the left, an input video frame in which a single subject is highlighted. On the right, the corresponding VSA estimation is in the form

Abstract

We present SCENE-pathy, a dataset and a set of baselines for studying the visual selective attention (VSA) of people towards the 3D scene in which they are located. In practice, VSA analysis reveals which parts of the scene are most attractive to an individual. Capturing VSA is of primary importance in marketing, retail management, surveillance, and many other fields. So far, VSA analysis has focused on very simple scenarios: a mall shelf or a tiny room, usually with a single subject involved. Our dataset instead considers a multi-person and much more complex 3D scenario, specifically a high-tech fair showroom presenting machines of an Industry 4.0 production line, where 25 subjects were each captured for 2 minutes while moving, observing the scene, and interacting socially. The subjects also filled out a questionnaire indicating which part of the scene they found most interesting. Data acquisition was performed using HoloLens 2 devices, which provided ground-truth data on people’s tracklets and gaze trajectories. Our proposed baselines estimate VSA from RGB video data and a 3D scene model alone, providing interpretable 3D heatmaps. In total, the dataset contains more than 100K RGB frames with, for each person, annotated 3D head positions and 3D gaze vectors.
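
To make the relation between the annotations (3D head positions, 3D gaze vectors) and the areas of interest more concrete, the following is a minimal sketch of one way a gaze ray could be assigned to an axis-aligned bounding box. It is not the baseline method of the paper: the function names, the box coordinates, and the choice of a standard ray/box "slab" test are all assumptions made for illustration, and the dataset's actual file format is not modeled here.

```python
import math
from typing import Dict, Iterable, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner) of an axis-aligned box


def ray_box_distance(origin: Vec3, direction: Vec3, box: Box) -> Optional[float]:
    """Slab test: distance along the ray to the box entry point, or None on a miss."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box[0], box[1]):
        if abs(d) < 1e-12:
            # Ray is parallel to this pair of faces: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t1, t2 = (lo - o) / d, (hi - o) / d
        if t1 > t2:
            t1, t2 = t2, t1
        t_near = max(t_near, t1)
        t_far = min(t_far, t2)
        if t_near > t_far:
            return None
    return t_near


def attended_area(head: Vec3, gaze: Vec3, areas: Dict[str, Box]) -> Optional[str]:
    """Name of the closest area hit by the gaze ray, or None if no area is hit."""
    hits = {}
    for name, box in areas.items():
        t = ray_box_distance(head, gaze, box)
        if t is not None:
            hits[name] = t
    return min(hits, key=hits.get) if hits else None


def attention_histogram(
    samples: Iterable[Tuple[Vec3, Vec3]], areas: Dict[str, Box]
) -> Dict[str, int]:
    """Count, per area, how many (head, gaze) samples point at it."""
    counts = {name: 0 for name in areas}
    for head, gaze in samples:
        name = attended_area(head, gaze, areas)
        if name is not None:
            counts[name] += 1
    return counts


# Hypothetical areas of interest (coordinates in metres, invented for the example).
areas = {
    "machine_a": ((2.0, -1.0, 0.0), (3.0, 1.0, 2.0)),
    "machine_b": ((5.0, -1.0, 0.0), (6.0, 1.0, 2.0)),
}
samples = [
    ((0.0, 0.0, 1.0), (1.0, 0.0, 0.0)),   # looks along +x, machine_a is hit first
    ((0.0, 0.0, 1.0), (-1.0, 0.0, 0.0)),  # looks away from both machines
    ((5.5, -3.0, 1.0), (0.0, 1.0, 0.0)),  # stands in front of machine_b
]
histogram = attention_histogram(samples, areas)
```

Counting hits per area over the frames of one subject gives a coarse per-area attention profile. A finer 3D heatmap would need a denser scene representation than one box per area, which this sketch does not attempt.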


BibTeX


      @InProceedings{10.1007/978-3-031-43148-7_30,
        author="Toaiari, Andrea
        and Cunico, Federico
        and Taioli, Francesco
        and Caputo, Ariel
        and Menegaz, Gloria
        and Giachetti, Andrea
        and Farinella, Giovanni Maria
        and Cristani, Marco",
        editor="Foresti, Gian Luca
        and Fusiello, Andrea
        and Hancock, Edwin",
        title="SCENE-pathy: Capturing the Visual Selective Attention of People Towards Scene Elements",
        booktitle="Image Analysis and Processing -- ICIAP 2023",
        year="2023",
        publisher="Springer Nature Switzerland",
        address="Cham",
        pages="352--363",
        isbn="978-3-031-43148-7"
        }