VISITOR: Visual Interactive State Sequence Exploration for Reinforcement Learning

Metz, Yannick; Bykovets, Eugene; Joos, Lucas; Keim, Daniel; El-Assady, Mennatallah

dc.contributor.author	Metz, Yannick	en_US
dc.contributor.author	Bykovets, Eugene	en_US
dc.contributor.author	Joos, Lucas	en_US
dc.contributor.author	Keim, Daniel	en_US
dc.contributor.author	El-Assady, Mennatallah	en_US
dc.contributor.editor	Bujack, Roxana	en_US
dc.contributor.editor	Archambault, Daniel	en_US
dc.contributor.editor	Schreck, Tobias	en_US
dc.date.accessioned	2023-06-10T06:17:30Z
dc.date.available	2023-06-10T06:17:30Z
dc.date.issued	2023
dc.description.abstract	Understanding the behavior of deep reinforcement learning agents is a crucial requirement throughout their development. Existing work has addressed the identification of observable behavioral patterns in state sequences or analysis of isolated internal representations; however, the overall decision-making of deep-learning RL agents remains opaque. To tackle this, we present VISITOR, a visual analytics system enabling the analysis of entire state sequences, the diagnosis of singular predictions, and the comparison between agents. A sequence embedding view enables the multiscale analysis of state sequences, utilizing custom embedding techniques for a stable spatialization of the observations and internal states. We provide multiple layers: (1) a state space embedding, highlighting different groups of states inside the state-action sequences, (2) a trajectory view, emphasizing decision points, (3) a network activation mapping, visualizing the relationship between observations and network activations, and (4) a transition embedding, enabling the analysis of state-to-state transitions. The embedding view is accompanied by an interactive reward view that captures the temporal development of metrics, which can be linked directly to states in the embedding. Lastly, a model list allows for the quick comparison of models across multiple metrics. Annotations can be exported to communicate results to different audiences. Our two-stage evaluation with eight experts confirms the system's effectiveness in identifying states of interest, comparing the quality of policies, and reasoning about the internal decision-making processes.	en_US
dc.description.number	3
dc.description.sectionheaders	Visualization and Machine Learning
dc.description.seriesinformation	Computer Graphics Forum
dc.description.volume	42
dc.identifier.doi	10.1111/cgf.14839
dc.identifier.issn	1467-8659
dc.identifier.pages	397-408
dc.identifier.pages	12 pages
dc.identifier.uri	https://doi.org/10.1111/cgf.14839
dc.identifier.uri	https://diglib.eg.org:443/handle/10.1111/cgf14839
dc.publisher	The Eurographics Association and John Wiley & Sons Ltd.	en_US
dc.rights	Attribution 4.0 International License
dc.rights.uri	https://creativecommons.org/licenses/by-nc/4.0/
dc.subject	CCS Concepts: Human-centered computing -> Visual analytics; Computing methodologies -> Reinforcement learning
dc.subject	Human-centered computing
dc.subject	Visual analytics
dc.subject	Computing methodologies
dc.subject	Reinforcement learning
dc.title	VISITOR: Visual Interactive State Sequence Exploration for Reinforcement Learning	en_US