Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data
| dc.contributor.author | Elsner, Tim | en_US | 
| dc.contributor.author | Usinger, Paula | en_US | 
| dc.contributor.author | Czech, Victor | en_US | 
| dc.contributor.author | Kobsik, Gregor | en_US | 
| dc.contributor.author | He, Yanjiang | en_US | 
| dc.contributor.author | Lim, Isaak | en_US | 
| dc.contributor.author | Kobbelt, Leif | en_US | 
| dc.contributor.editor | Egger, Bernhard | en_US | 
| dc.contributor.editor | Günther, Tobias | en_US | 
| dc.date.accessioned | 2025-09-24T10:37:11Z | |
| dc.date.available | 2025-09-24T10:37:11Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Quantised autoencoders usually split images into local patches, each encoded by one token. This representation is potentially inefficient, as the same number of tokens is spent per region, regardless of the visual information content in that region. To mitigate uneven distribution of information content, modern architectures provide an adaptive discretisation or add an attention mechanism to the autoencoder to infuse global information into the local tokens. Despite these improvements, tokens are still associated with a local image region. In contrast, our method is inspired by spectral decompositions which transform an input signal into a superposition of global frequencies. Taking the data-driven perspective, we train an encoder that produces a combination of tokens that are then decoded jointly, going beyond the simple linear superposition of spectral decompositions. We achieve this global description with an efficient transpose operation between features and channels and demonstrate how our global and holistic representation improves compression and can boost downstream tasks like generation. | en_US | 
| dc.description.sectionheaders | Neural and Differentiable Rendering | |
| dc.description.seriesinformation | Vision, Modeling, and Visualization | |
| dc.identifier.doi | 10.2312/vmv.20251231 | |
| dc.identifier.isbn | 978-3-03868-294-3 | |
| dc.identifier.pages | 8 pages | |
| dc.identifier.uri | https://doi.org/10.2312/vmv.20251231 | |
| dc.identifier.uri | https://diglib.eg.org/handle/10.2312/vmv20251231 | |
| dc.publisher | The Eurographics Association | en_US | 
| dc.rights | Attribution 4.0 International License | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | CCS Concepts: Computing methodologies → Image compression; Machine learning algorithms | |
| dc.subject | Computing methodologies → Image compression | |
| dc.subject | Machine learning algorithms | |
| dc.title | Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data | en_US | 