Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data

dc.contributor.author: Elsner, Tim (en_US)
dc.contributor.author: Usinger, Paula (en_US)
dc.contributor.author: Czech, Victor (en_US)
dc.contributor.author: Kobsik, Gregor (en_US)
dc.contributor.author: He, Yanjiang (en_US)
dc.contributor.author: Lim, Isaak (en_US)
dc.contributor.author: Kobbelt, Leif (en_US)
dc.contributor.editor: Egger, Bernhard (en_US)
dc.contributor.editor: Günther, Tobias (en_US)
dc.date.accessioned: 2025-09-24T10:37:11Z
dc.date.available: 2025-09-24T10:37:11Z
dc.date.issued: 2025
dc.description.abstract: Quantised autoencoders usually split images into local patches, each encoded by one token. This representation is potentially inefficient, as the same number of tokens is spent per region, regardless of the visual information content in that region. To mitigate the uneven distribution of information content, modern architectures provide an adaptive discretisation or add an attention mechanism to the autoencoder to infuse global information into the local tokens. Despite these improvements, tokens are still associated with a local image region. In contrast, our method is inspired by spectral decompositions, which transform an input signal into a superposition of global frequencies. Taking a data-driven perspective, we train an encoder that produces a combination of tokens that are then decoded jointly, going beyond the simple linear superposition of spectral decompositions. We achieve this global description with an efficient transpose operation between features and channels and demonstrate how our global and holistic representation improves compression and can boost downstream tasks like generation. (en_US)
dc.description.sectionheaders: Neural and Differentiable Rendering
dc.description.seriesinformation: Vision, Modeling, and Visualization
dc.identifier.doi: 10.2312/vmv.20251231
dc.identifier.isbn: 978-3-03868-294-3
dc.identifier.pages: 8 pages
dc.identifier.uri: https://doi.org/10.2312/vmv.20251231
dc.identifier.uri: https://diglib.eg.org/handle/10.2312/vmv20251231
dc.publisher: The Eurographics Association (en_US)
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Computing methodologies → Image compression; Machine learning algorithms
dc.subject: Computing methodologies → Image compression
dc.subject: Machine learning algorithms
dc.title: Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data (en_US)
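
The abstract's contrast between local patch tokens and global tokens can be illustrated with a toy sketch. This is not the authors' implementation: the feature values, both codebooks, and the helper names (`nearest`, `transpose`, `local_tokens`, `global_tokens`) are hypothetical, and a plain nearest-neighbour lookup stands in for whatever quantiser the paper actually uses. The sketch only shows how transposing an N x C feature matrix (N spatial positions, C channels) changes what a single token describes: one position before the transpose, one channel across all positions after it.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], vec)),
    )


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def local_tokens(features, codebook):
    """One token per spatial position: each row holds the C channels of one position."""
    return [nearest(codebook, row) for row in features]


def global_tokens(features, codebook):
    """One token per channel: after the transpose, each row spans all N positions."""
    return [nearest(codebook, row) for row in transpose(features)]


# Hypothetical encoder output: N = 4 positions, C = 3 channels.
features = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 1.0, 0.1],
]

# Hypothetical codebooks: local entries have length C, global entries have length N.
local_codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
global_codebook = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]

local = local_tokens(features, local_codebook)      # 4 tokens, one per position
global_ = global_tokens(features, global_codebook)  # 3 tokens, one per channel
print(local, global_)
```

In this toy setting the token count is tied to the number of positions in the local case and to the number of channels in the global case, so every global token depends on the whole input rather than on a single region.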
Files
Original bundle
Name: vmv20251231.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format