Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data

Date
2025
Publisher
The Eurographics Association
Abstract
Quantised autoencoders usually split images into local patches, each encoded by one token. This representation is potentially inefficient, as the same number of tokens is spent per region, regardless of the visual information content in that region. To mitigate this uneven distribution of information content, modern architectures provide an adaptive discretisation or add an attention mechanism to the autoencoder to infuse global information into the local tokens. Despite these improvements, tokens are still associated with a local image region. In contrast, our method is inspired by spectral decompositions, which transform an input signal into a superposition of global frequencies. Taking a data-driven perspective, we train an encoder that produces a combination of tokens that are then decoded jointly, going beyond the simple linear superposition of spectral decompositions. We achieve this global description with an efficient transpose operation between features and channels and demonstrate how our global and holistic representation improves compression and can boost downstream tasks like generation.
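
The following is a minimal, hypothetical PyTorch sketch of the axis-transposed quantisation idea described in the abstract. The class name GlobalTokenQuantiser, all tensor shapes, and the codebook size are assumptions for illustration; the paper's actual architecture, training losses, and joint decoder are not reproduced. The sketch only shows the shift from quantising per-patch feature vectors of shape (N, C) to quantising per-channel vectors of shape (C, N), so that each token describes a pattern spanning the whole image.

import torch
import torch.nn as nn

class GlobalTokenQuantiser(nn.Module):
    """Hypothetical sketch: quantise along the transposed (channel) axis so each token is global."""

    def __init__(self, spatial_dim=16 * 16, codebook_size=1024):
        super().__init__()
        # Each codebook entry lives in the spatial dimension, so a single token
        # encodes a pattern covering all image positions rather than one patch.
        self.codebook = nn.Embedding(codebook_size, spatial_dim)

    def forward(self, features):
        # features: (B, C, H, W) feature map from any convolutional encoder
        b, c, h, w = features.shape
        local = features.reshape(b, c, h * w).transpose(1, 2)   # (B, N, C): usual per-patch tokens
        tokens = local.transpose(1, 2)                          # (B, C, N): one token per channel
        # squared Euclidean distances to all codebook entries (standard VQ lookup)
        dists = (tokens.pow(2).sum(-1, keepdim=True)
                 + self.codebook.weight.pow(2).sum(-1)
                 - 2 * tokens @ self.codebook.weight.t())       # (B, C, K)
        indices = dists.argmin(dim=-1)                          # (B, C): one code index per global token
        quantised = self.codebook(indices)                      # (B, C, N)
        # straight-through estimator so gradients still reach the encoder
        quantised = tokens + (quantised - tokens).detach()
        return quantised.reshape(b, c, h, w), indices

In a full model, the quantised map would be passed to a decoder that reconstructs the image jointly from all tokens, and the usual VQ codebook and commitment losses would be added during training; those standard components are omitted here.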
Description

CCS Concepts: Computing methodologies → Image compression; Machine learning algorithms

        
@inproceedings{10.2312:vmv.20251231,
  booktitle = {Vision, Modeling, and Visualization},
  editor    = {Egger, Bernhard and Günther, Tobias},
  title     = {{Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data}},
  author    = {Elsner, Tim and Usinger, Paula and Czech, Victor and Kobsik, Gregor and He, Yanjiang and Lim, Isaak and Kobbelt, Leif},
  year      = {2025},
  publisher = {The Eurographics Association},
  ISBN      = {978-3-03868-294-3},
  DOI       = {10.2312/vmv.20251231}
}