Squeeze3D
Your 3D Generation Model is Secretly an Extreme Neural Compressor

University of Toronto

We propose Squeeze3D, a framework that learns to bridge the latent spaces of a pre-trained 3D encoder and a pre-trained 3D generator, enabling extreme compression of 3D data.

Squeeze3D teaser
Figure: We showcase extreme compression of 3D models while preserving perceptual quality. (Top) Our method compresses a diverse collection of 3D models in various formats: meshes, point clouds, and radiance fields. (Bottom) Detailed comparison between the original "Pikachu" model (6.11 MB) and the reconstruction after compression. The object was compressed to merely 0.003 MB, a reduction of more than 2000×.

Abstract

We propose Squeeze3D, a novel framework that leverages the implicit prior knowledge learned by existing pre-trained 3D generative models to compress 3D data at extremely high compression ratios. Our approach bridges the latent spaces of a pre-trained encoder and a pre-trained generation model through trainable mapping networks. Any 3D model represented as a mesh, point cloud, or radiance field is first encoded by the pre-trained encoder and then transformed (i.e., compressed) into a highly compact latent code, which serves as an extremely compressed representation of the original 3D model. A mapping network transforms the compressed latent code into the latent space of a powerful generative model, which is then conditioned to recreate the original 3D model (i.e., decompression). Squeeze3D is trained entirely on generated synthetic data and does not require any 3D datasets. The Squeeze3D architecture can be used flexibly with existing pre-trained 3D encoders and generative models, and supports different formats, including meshes, point clouds, and radiance fields. Our experiments demonstrate that Squeeze3D achieves compression ratios of up to 2187× for textured meshes, 55× for point clouds, and 619× for radiance fields while maintaining visual quality comparable to many existing methods. Squeeze3D incurs only a small compression and decompression latency, since it does not require training object-specific networks to compress an object.

🚀 Key Contributions

🎯 To the best of our knowledge, this is the first framework that leverages pre-existing pre-trained generative models to enable extreme compression of 3D data.

🔗 We demonstrate the feasibility of establishing correspondences between disparate latent manifolds originating from neural architectures with fundamentally different structures, optimization objectives, and training distributions.

📊 We evaluate Squeeze3D for mesh, point cloud, and radiance field compression and demonstrate that generative models are a promising approach for extreme compression of 3D models. Squeeze3D can be flexibly extended to different encoders, generative models, and 3D formats.

Squeeze3D

Squeeze3D achieves extreme compression by bridging the latent spaces of pre-trained 3D encoders and generators through lightweight mapping networks. Rather than training specialized compression models for each object, we leverage the implicit knowledge already learned by existing 3D generative models.

Squeeze3D pipeline
Figure: Overview of the Squeeze3D pipeline. During compression, any 3D geometry (mesh, point cloud, or radiance field) is encoded and transformed into a highly compact representation. During decompression, this representation is mapped to the generator's latent space to reconstruct the original geometry.

Key Components

The architecture consists of four main components:

  • Pre-trained 3D Encoder (E): Converts 3D geometries into latent representations z_E
  • Forward Mapping Network (F^E_θ): Transforms encoder latents into the compressed representation z_comp
  • Reverse Mapping Network (F^D_θ): Maps the compressed representation into the generator latent space z_G
  • Pre-trained 3D Generator (G): Reconstructs 3D geometry from latent codes
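
To make the data flow concrete, here is a minimal PyTorch sketch of the four components chained together. The latent sizes, the MLP mapping networks, and the calling conventions for the frozen encoder and generator are illustrative assumptions, not the paper's exact architectures.

```python
import torch
import torch.nn as nn

# Illustrative latent sizes -- assumptions, not the paper's exact values.
D_ENC, D_COMP, D_GEN = 1024, 128, 768

class MappingNetwork(nn.Module):
    """A small MLP that bridges two latent spaces."""
    def __init__(self, d_in, d_out, d_hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_out),
        )

    def forward(self, z):
        return self.net(z)

f_fwd = MappingNetwork(D_ENC, D_COMP)   # F^E_θ: encoder latent -> compressed code
f_rev = MappingNetwork(D_COMP, D_GEN)   # F^D_θ: compressed code -> generator latent

def compress(asset, encoder):
    """Encode a 3D asset (mesh / point cloud / radiance field) with the
    frozen pre-trained encoder E, then squeeze it into z_comp."""
    z_e = encoder(asset)
    return f_fwd(z_e)

def decompress(z_comp, generator):
    """Map z_comp into the generator's latent space and let the frozen
    pre-trained generator G reconstruct the 3D asset."""
    z_g = f_rev(z_comp)
    return generator(z_g)
```

Only the two mapping networks are trained; the encoder and generator stay frozen throughout.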

Training Process

Since we cannot directly obtain paired data between encoder and generator latent spaces, we employ a synthetic data generation approach:

Training process
Figure: Training Squeeze3D. (a) We generate synthetic training data by sampling from the generator and encoding the outputs. (b) The mapping networks are trained with a combination of reconstruction loss and our novel Gram loss that ensures efficient use of the compressed representation dimensions.
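
In code, this data-generation loop reduces to: sample a generator latent, decode it into a synthetic 3D asset, and re-encode that asset. A minimal sketch, assuming frozen callables `G` and `E` and omitting any rendering step needed to bridge their input formats:

```python
import torch

@torch.no_grad()
def make_training_pairs(G, E, n_samples, d_gen, device="cpu"):
    """Build paired latents (z_E, z_G) without any 3D dataset: each pair
    comes from one synthetic object sampled from the generator itself."""
    pairs = []
    for _ in range(n_samples):
        z_g = torch.randn(1, d_gen, device=device)  # sample a generator latent
        asset = G(z_g)                              # synthetic 3D output of G
        z_e = E(asset)                              # encoder latent of the same object
        pairs.append((z_e, z_g))
    return pairs
```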
  • Gram Loss: We discovered that standard training leads to redundant latent representations. Our Gram loss enforces orthogonality in the compressed space, ensuring all dimensions are effectively utilized (see the sketch after this list).
  • Format Agnostic: By using pre-trained encoders, Squeeze3D can compress any 3D format without format-specific training.
  • Extreme Compression: The compressed representation zcomp can be as small as 128 dimensions, achieving compression ratios up to 2187×.
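
Below is one plausible reading of the Gram loss as a decorrelation penalty on a batch of compressed codes: center and normalize each dimension, then push the Gram matrix of dimensions toward the identity so off-diagonal correlations (redundant dimensions) vanish. This is a hedged sketch; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def gram_loss(z_comp):
    """Decorrelation penalty on a batch of compressed codes.
    z_comp: (batch, d). Off-diagonal entries of the d x d Gram matrix of
    dimensions measure redundancy between dimensions; driving the matrix
    toward the identity encourages all d dimensions to carry information."""
    z = z_comp - z_comp.mean(dim=0, keepdim=True)   # center each dimension
    z = F.normalize(z, dim=0)                       # unit-norm columns
    gram = z.T @ z                                  # (d, d) Gram matrix
    eye = torch.eye(gram.shape[0], device=gram.device)
    return ((gram - eye) ** 2).mean()               # push off-diagonals to zero

# Combined training objective (lambda_g is a tunable weight, assumed here):
#   loss = reconstruction_loss(f_rev(z_comp), z_g) + lambda_g * gram_loss(z_comp)
```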

The entire compression and decompression process is fast, requiring only forward passes through small neural networks, making it practical for real-world applications.
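
For intuition on the quoted ratios: a compression ratio is simply the original file size divided by the serialized size of z_comp (which depends on how the code is quantized and entropy-coded, details we do not assume here). Checking the teaser's numbers:

```python
def compression_ratio(original_mb, compressed_mb):
    return original_mb / compressed_mb

# The teaser's "Pikachu" mesh: 6.11 MB compressed to 0.003 MB.
print(f"{compression_ratio(6.11, 0.003):.0f}x")  # ~2037x
```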

Main Results


Comparison with Previous Works

Mesh compression comparison
Figure: Qualitative mesh compression results. We compare Squeeze3D to state-of-the-art methods. Our approach maintains visually important geometric details.
Point cloud compression comparison
Figure: Qualitative point cloud compression results. We show qualitative results comparing Squeeze3D to state-of-the-art methods. Our approach achieves significantly higher compression ratios while maintaining perceptually important geometric details.
Radiance field compression comparison
Figure: Qualitative radiance field compression results. We show qualitative results comparing Squeeze3D to state-of-the-art methods. Our approach achieves a significantly higher compression ratio while maintaining visually important geometric details.

Mesh Results

Comparisons between ground truth 3D meshes and those generated by Squeeze3D.

Radiance Field Results

Comparisons between ground truth radiance fields (left) and those generated by Squeeze3D (right). These radiance fields are randomly chosen and shown from a random view from the test set.

Point Cloud Results

Comparisons between ground truth point clouds and those generated by Squeeze3D.

Additional Results and Analysis


Interpolation results
Figure: Interpolation. The compressed representations we obtain can also be interpolated. In these examples, we obtain the compressed representations of the leftmost and rightmost meshes and linearly interpolate between them.
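
A minimal sketch of this interpolation, reusing the `decompress` helper and a frozen `generator` from the pipeline sketch above (both illustrative assumptions):

```python
import torch

def interpolate_codes(z_a, z_b, generator, steps=8):
    """Linearly blend two compressed codes and decode each blend."""
    return [
        decompress((1.0 - t) * z_a + t * z_b, generator)
        for t in torch.linspace(0.0, 1.0, steps)
    ]
```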
Multiple camera angles
Figure: Consistent reconstruction across views. We show a mesh reconstructed with our method from multiple camera angles, demonstrating that our approach learns a correct transformation between the latent spaces and that the reconstruction is consistent across views.
Textureless mesh compression
Figure: Textureless mesh compression. We observe that our method learns to effectively represent intricate geometrical details even for textureless meshes.
Different 3D generators
Figure: Compression results using different 3D generators. Squeeze3D is agnostic to the choice of 3D generation model; we show compression results with two 3D generators, OpenLRM and Shap-E, using 3D meshes that lie within the representation capacity of each generator.
Complex mesh compression
Figure: Compressing Complex Meshes. Squeeze3D can be used to compress highly complex textured 3D meshes (in this case, 77,851 vertices and 120,812 faces).

Paper

Paper thumbnail

Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor

Rishit Dagli, Yushi Guan, Sankeerth Durvasula, Mohammadreza Mofayezi, Nandita Vijaykumar

University of Toronto

BibTeX

@misc{squeeze3d,
      title={Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor}, 
      author={Rishit Dagli and Yushi Guan and Sankeerth Durvasula and Mohammadreza Mofayezi and Nandita Vijaykumar},
      year={2025},
      eprint={2506.07932},
      archivePrefix={arXiv},
      primaryClass={cs.GR},
      url={https://arxiv.org/abs/2506.07932}, 
}