Peek-a-bot: learning through vision in Unreal Engine

Pietra, Daniele Della; Garau, Nicola; Conci, Nicola; Granelli, Fabrizio

dc.contributor.author	Pietra, Daniele Della	en_US
dc.contributor.author	Garau, Nicola	en_US
dc.contributor.author	Conci, Nicola	en_US
dc.contributor.author	Granelli, Fabrizio	en_US
dc.contributor.editor	Caputo, Ariel	en_US
dc.contributor.editor	Garro, Valeria	en_US
dc.contributor.editor	Giachetti, Andrea	en_US
dc.contributor.editor	Castellani, Umberto	en_US
dc.contributor.editor	Dulecha, Tinsae Gebrechristos	en_US
dc.date.accessioned	2024-11-11T12:47:43Z
dc.date.available	2024-11-11T12:47:43Z
dc.date.issued	2024
dc.description.abstract	Humans learn to navigate and interact with their surroundings through their senses, particularly vision. Ego-vision has lately become a significant focus in computer vision, enabling neural networks to learn from first-person data effectively, as we humans do. Supervised or self-supervised learning of depth, object location and segmentation maps through deep networks has shown considerable success in recent years. On the other hand, reinforcement learning (RL) has been focusing on learning from different kinds of sensing data, such as rays, collisions, distances, and other types of observations. In this paper, we merge the two approaches, providing a complete pipeline to train reinforcement learning agents inside virtual environments, relying only on vision and eliminating the need for traditional RL observations. We demonstrate that visual stimuli, if encoded by a carefully designed vision encoder, can provide informative observations, thus replacing ray-based approaches and drastically simplifying the reward shaping typical of classical RL. Our method is fully implemented inside Unreal Engine 5, from the real-time inference of visual features to the online training of the agents' behaviour using the Proximal Policy Optimization (PPO) algorithm. To the best of our knowledge, this is the first in-engine solution targeting video games and simulation, enabling game developers to easily train vision-based RL agents without writing a single line of code. All the code, complete experiments and analysis will be available at https://mmlab-cv.github.io/Peek-a-bot/.	en_US
dc.description.sectionheaders	Virtual Training and Simulation
dc.description.seriesinformation	Smart Tools and Applications in Graphics - Eurographics Italian Chapter Conference
dc.identifier.doi	10.2312/stag.20241330
dc.identifier.isbn	978-3-03868-265-3
dc.identifier.issn	2617-4855
dc.identifier.pages	11 pages
dc.identifier.uri	https://doi.org/10.2312/stag.20241330
dc.identifier.uri	https://diglib.eg.org/handle/10.2312/stag20241330
dc.publisher	The Eurographics Association	en_US
dc.rights	Attribution 4.0 International License
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.title	Peek-a-bot: learning through vision in Unreal Engine	en_US

Files

Original bundle

Name: stag20241330.pdf
Size: 38.06 MB
Format: Adobe Portable Document Format

Collections

Italian Chapter Conference 2024 - Smart Tools and Apps in Graphics