Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data

Elsner, Tim; Usinger, Paula; Czech, Victor; Kobsik, Gregor; He, Yanjiang; Lim, Isaak; Kobbelt, Leif

dc.contributor.author	Elsner, Tim	en_US
dc.contributor.author	Usinger, Paula	en_US
dc.contributor.author	Czech, Victor	en_US
dc.contributor.author	Kobsik, Gregor	en_US
dc.contributor.author	He, Yanjiang	en_US
dc.contributor.author	Lim, Isaak	en_US
dc.contributor.author	Kobbelt, Leif	en_US
dc.contributor.editor	Egger, Bernhard	en_US
dc.contributor.editor	Günther, Tobias	en_US
dc.date.accessioned	2025-09-24T10:37:11Z
dc.date.available	2025-09-24T10:37:11Z
dc.date.issued	2025
dc.description.abstract	Quantised autoencoders usually split images into local patches, each encoded by one token. This representation is potentially inefficient, as the same number of tokens is spent per region, regardless of the visual information content in that region. To mitigate the uneven distribution of information content, modern architectures provide an adaptive discretisation or add an attention mechanism to the autoencoder to infuse global information into the local tokens. Despite these improvements, tokens are still associated with a local image region. In contrast, our method is inspired by spectral decompositions, which transform an input signal into a superposition of global frequencies. Taking a data-driven perspective, we train an encoder that produces a combination of tokens that are then decoded jointly, going beyond the simple linear superposition of spectral decompositions. We achieve this global description with an efficient transpose operation between features and channels and demonstrate how our global and holistic representation improves compression and can boost downstream tasks like generation.	en_US
dc.description.sectionheaders	Neural and Differentiable Rendering
dc.description.seriesinformation	Vision, Modeling, and Visualization
dc.identifier.doi	10.2312/vmv.20251231
dc.identifier.isbn	978-3-03868-294-3
dc.identifier.pages	8 pages
dc.identifier.uri	https://doi.org/10.2312/vmv.20251231
dc.identifier.uri	https://diglib.eg.org/handle/10.2312/vmv20251231
dc.publisher	The Eurographics Association	en_US
dc.rights	Attribution 4.0 International License
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	CCS Concepts: Computing methodologies → Image compression; Machine learning algorithms
dc.subject	Computing methodologies → Image compression
dc.subject	Machine learning algorithms
dc.title	Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data	en_US

Files

Original bundle

Name: vmv20251231.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format

Collections

VMV2025