Research Stories
- Selected as an ICLR 2024 Spotlight and a CVPR 2024 Highlight, at top conferences in machine learning and computer vision, respectively
- Anticipates a new way of representing media
Department of Electronic and Electrical Engineering
Professors Eunbyung Park and Jong Hwan Ko
A research team led by Professors Eunbyung Park and Jong Hwan Ko in the Department of Electronic and Electrical Engineering has proposed two innovative media representation methods that efficiently reconstruct complex 3D scenes using a new model structure based on neural fields. The first method integrates neural networks with traditional grid-based representations in a novel way, while the second represents scenes with compact 3D Gaussians.
1) Coordinate-Aware Modulation (CAM)
To represent 3D images or videos, typical methods extract features from a grid and then process them through a neural network. In contrast, the approach proposed in this work fuses the grid features into each layer of the neural network through a modulation operation. Whereas conventional grid-based methods require large storage, this method uses very small grids and still represents high-frequency signals efficiently.
[Figure 1] Architecture of the proposed CAM
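As a rough illustration of the idea described above, the following minimal sketch shows a tiny network whose hidden layer is modulated by values looked up from very small grids at the input coordinate. It assumes that modulation takes the form of a per-coordinate scale and shift applied before the activation; the function names, the 1D coordinate, and the toy weights are all hypothetical and are not taken from the team's implementation.

```python
import math

def interp_grid(grid, x):
    """Linearly interpolate a small 1D grid at coordinate x in [0, 1]."""
    pos = x * (len(grid) - 1)
    lo = min(int(math.floor(pos)), len(grid) - 2)
    t = pos - lo
    return (1.0 - t) * grid[lo] + t * grid[lo + 1]

def dense(weights, bias, inputs):
    """Dense layer: `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def modulated_mlp(x, layers, scale_grids, shift_grids):
    """Forward pass in which every hidden layer is modulated by grid features.

    Assumption: modulation is h <- scale(x) * h + shift(x), where scale and
    shift are scalars interpolated from small per-layer grids at coordinate x.
    """
    h = [x]
    last = len(layers) - 1
    for i, (weights, bias) in enumerate(layers):
        h = dense(weights, bias, h)
        if i < last:  # the output layer is left unmodulated
            gamma = interp_grid(scale_grids[i], x)
            beta = interp_grid(shift_grids[i], x)
            h = [max(0.0, gamma * v + beta) for v in h]  # modulate, then ReLU
    return h

# Hypothetical toy network: 1 input -> 2 hidden units -> 1 output.
layers = [([[1.0], [-1.0]], [0.0, 0.5]), ([[1.0, 2.0]], [0.1])]
scale_grids = [[1.0, 3.0]]  # a two-entry grid for the single hidden layer
shift_grids = [[0.0, 0.0]]
output = modulated_mlp(0.25, layers, scale_grids, shift_grids)
print(output)  # approximately [1.225]
```

In this sketch the grids only need to carry one scale and one shift value per cell, which hints at why such grids can stay small compared with grids that store full feature vectors.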
The method developed by the research team delivers high performance with a significantly smaller network size when applied to various media data such as images, videos, 3D models, and 3D videos.
[Figure 2] Visualization of CAM on different domains
[Figure 3] Performance evaluation on different domains
2) Compact 3D Gaussian Splatting (C3DGS)
Representing 3D spaces as sets of 3D Gaussian points has recently made it possible to achieve fast rendering speeds of over 100 FPS. However, this scene representation technique requires a very large storage capacity. The research team dramatically reduced the number of Gaussians needed to represent a space without degrading rendering performance. In addition, by introducing a new way of representing the Gaussians themselves, the method achieves high performance and fast rendering while also greatly reducing storage requirements.
[Figure 4] Scene representation with 3D Gaussians and C3DGS
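To make the idea of reducing the number of Gaussians more concrete, here is a minimal sketch that drops Gaussians whose estimated contribution to the scene is negligible. The importance score used here (opacity times the product of the axis scales) and the fixed threshold are simplifying assumptions chosen for illustration; the article does not state the team's actual criterion, and the names `Gaussian`, `contribution`, and `prune` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gaussian:
    position: tuple  # (x, y, z) center
    scale: tuple     # (sx, sy, sz) extent along each axis
    opacity: float   # value in [0, 1]

def contribution(g):
    """Hypothetical importance score: opacity times the product of axis scales."""
    sx, sy, sz = g.scale
    return g.opacity * sx * sy * sz

def prune(gaussians, threshold):
    """Keep only Gaussians whose score reaches the threshold (assumed criterion)."""
    return [g for g in gaussians if contribution(g) >= threshold]

scene = [
    Gaussian((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), 0.9),
    Gaussian((1.0, 0.0, 0.0), (0.1, 0.1, 0.1), 0.5),  # tiny and faint
    Gaussian((0.0, 1.0, 0.0), (0.5, 0.5, 0.5), 0.8),
]
kept = prune(scene, threshold=0.01)
print(len(scene), "->", len(kept))  # prints "3 -> 2"
```

A fixed threshold like this is only a stand-in: deciding which Gaussians can be removed without hurting rendering quality is the hard part of the problem.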
In performance evaluations conducted with various real-world datasets, the method proposed by the research team resulted in more than a 25-fold decrease in storage requirements and an improvement in rendering speed, without compromising rendering quality.
[Figure 5] Performance evaluation on various datasets
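For a sense of scale, the short calculation below estimates the storage of an uncompressed Gaussian scene and what a 25-fold reduction would mean. It assumes the attribute layout commonly used in 3D Gaussian splatting (position, scale, rotation quaternion, opacity, and degree-3 spherical-harmonic color, all stored as 32-bit floats) and a hypothetical scene of one million Gaussians; neither figure comes from the article.

```python
# Assumed per-Gaussian layout: 3 position + 3 scale + 4 rotation (quaternion)
# + 1 opacity + 48 spherical-harmonic color coefficients (16 per RGB channel).
FLOATS_PER_GAUSSIAN = 3 + 3 + 4 + 1 + 48
BYTES_PER_FLOAT = 4  # 32-bit floats

def storage_megabytes(num_gaussians, floats_per_gaussian=FLOATS_PER_GAUSSIAN):
    """Uncompressed storage in megabytes (1 MB = 10**6 bytes)."""
    return num_gaussians * floats_per_gaussian * BYTES_PER_FLOAT / 1e6

baseline_mb = storage_megabytes(1_000_000)  # hypothetical scene size
reduced_mb = baseline_mb / 25               # the 25-fold reduction reported above
print(f"{baseline_mb:.1f} MB -> {reduced_mb:.2f} MB")  # prints "236.0 MB -> 9.44 MB"
```

Under these assumptions, a scene that would otherwise occupy hundreds of megabytes fits in under ten.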
Prof. Eunbyung Park remarked, "We have developed novel methodologies capable of representing complex 3D scenes efficiently through an innovative structure that moves away from conventional approaches. These methodologies hold significant potential for effective application in currently popular areas such as NeRF or generative models."
The research team's first study was accepted for publication at ICLR 2024 (International Conference on Learning Representations), considered one of the top academic conferences in the machine learning field alongside NeurIPS and ICML. It was selected as a Spotlight, which represents the top 6% of submitted papers. The second study was accepted for publication at CVPR 2024 (the IEEE/CVF Conference on Computer Vision and Pattern Recognition), which is recognized as the premier academic conference in the field of computer vision. This paper was selected as a Highlight, representing the top 3% of submissions.
Paper title: Coordinate-Aware Modulation for Neural Fields
Research homepage: https://maincold2.github.io/cam/
Authors: Joo Chan Lee (First author, integrated Master's and PhD program in Dept. of Artificial Intelligence), Daniel Rho (Master's graduate in Dept. of Artificial Intelligence, currently at KT), Seungtae Nam (PhD candidate in Dept. of Artificial Intelligence), Jong Hwan Ko (Corresponding author, professor in the Dept. of Electronic and Electrical Engineering), Eunbyung Park (Corresponding author, professor in the Dept. of Electronic and Electrical Engineering)
Paper title: Compact 3D Gaussian Representation for Radiance Field
Research homepage: https://maincold2.github.io/c3dgs/
Authors: Joo Chan Lee (First author, integrated Master's and PhD program in Dept. of Artificial Intelligence), Daniel Rho (Master's graduate in Dept. of Artificial Intelligence, currently at KT), Xiangyu Sun (PhD candidate in Dept. of Electrical and Computer Engineering), Jong Hwan Ko (Corresponding author, professor in the Dept. of Electronic and Electrical Engineering), Eunbyung Park (Corresponding author, professor in the Dept. of Electronic and Electrical Engineering)