VAE: Variational Autoencoder

Posted Jul 12, 2025 Updated Oct 16, 2025

By soohyun-chris-jeon


🟣 Intro

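This post covers the VAE, the first entry in a series on representative generative models.

A VAE is a generative model that learns a probabilistic latent space. A conventional autoencoder compresses the input into a single deterministic latent vector, whereas a VAE treats the latent code as a probability distribution.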

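When I first studied it, the word 'Variational' did not click easily, and the model gave me a hard time. The term comes from variational inference: the intractable posterior $p(z|x)$ is approximated by a tractable distribution $q(z|x)$ that is found by optimization.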

A VAE is trained by maximizing the following evidence lower bound (ELBO):

\[\log p(x) \ge \mathbb{E}_{q(z|x)}[\log p(x|z)] - D_{KL}(q(z|x) \| p(z))\]
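
For readers wondering where the bound comes from, one standard way to see it is the identity below. It is a sketch that reuses the symbols above and additionally assumes the true posterior $p(z|x)$, which this post does not otherwise write out:

\[\log p(x) = \underbrace{\mathbb{E}_{q(z|x)}[\log p(x|z)] - D_{KL}(q(z|x) \| p(z))}_{\text{ELBO}} + D_{KL}(q(z|x) \| p(z|x))\]

The last KL term is non-negative, so dropping it yields the inequality. The bound becomes tight when $q(z|x)$ equals the true posterior $p(z|x)$.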

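Encoder: input $x$ → parameters $(\mu, \sigma^2)$ of the latent distribution, from which $z \sim \mathcal{N}(\mu, \sigma^2)$ is drawn
Decoder: $z$ → reconstructed image $\hat{x}$
Loss (the negative ELBO, minimized during training):
1. Reconstruction loss: $\|x - \hat{x}\|^2$ (matches $-\log p(x|z)$ for a Gaussian decoder, up to scale and an additive constant)
2. Distribution regularization (KL divergence): $D_{KL}(q(z|x) \| p(z))$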
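
The two loss terms listed above can be sketched in a few lines of framework-free Python. This is a minimal illustration, not a training-ready implementation. It assumes a standard normal prior $p(z) = \mathcal{N}(0, I)$, a diagonal Gaussian $q(z|x)$, and an encoder that outputs `log_var` $= \log \sigma^2$. None of these choices are stated in the post, and the function names are hypothetical.

```python
import math


def reconstruction_loss(x, x_hat):
    """Squared error ||x - x_hat||^2, summed over all elements."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ).

    Assumption: the prior is a standard normal and log_var = log(sigma^2).
    Per dimension the term is 0.5 * (mu^2 + sigma^2 - log(sigma^2) - 1).
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def vae_loss(x, x_hat, mu, log_var):
    """Negative ELBO (up to scale and constants): reconstruction + KL."""
    return reconstruction_loss(x, x_hat) + kl_to_standard_normal(mu, log_var)
```

Under these assumptions the KL term vanishes exactly when the encoder outputs $\mu = 0$ and $\sigma^2 = 1$, so `vae_loss` then reduces to the reconstruction error alone. Any deviation of $q(z|x)$ from the prior adds a positive penalty.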

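| Item | Description |
| --- | --- |
| Pros | The latent space is continuous and can be sampled from |
| Cons | Generated images tend to be blurry (because of the pixel-wise loss) |
| Applications | Image generation, anomaly detection, disentanglement |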

The core idea of the VAE is that the encoder maps an image to a probability distribution over the latent space rather than to a single point, so that the latent space is filled in smoothly.
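
To make the "distribution instead of a point" idea concrete, the sketch below draws latent samples as $z = \mu + \sigma \cdot \epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$. This way of sampling is commonly called the reparameterization trick. The post does not name it, so treat its use here, along with the `sample_latent` helper and the example values of `mu` and `log_var`, as assumptions made for illustration.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Draw z ~ N(mu, diag(sigma^2)) as z = mu + sigma * eps, eps ~ N(0, 1).

    Assumption: the encoder outputs log_var = log(sigma^2) per dimension.
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


# Hypothetical encoder output for one input: sigma = 0.5 and sigma = 0.1.
rng = random.Random(0)
mu = [1.0, -2.0]
log_var = [math.log(0.25), math.log(0.01)]

# The same input yields a different z on every draw ...
z_first = sample_latent(mu, log_var, rng)
z_second = sample_latent(mu, log_var, rng)

# ... but the draws cluster around mu, so nearby latent points decode similarly.
samples = [sample_latent(mu, log_var, rng) for _ in range(5000)]
mean = [sum(s[i] for s in samples) / len(samples) for i in range(len(mu))]
```

A plain autoencoder would return the same latent vector every time for a given input. Here each input covers a small region around `mu`, which is what lets neighboring regions of the latent space overlap and stay usable for sampling.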

Studying this idea helps build an understanding of generative models in deep learning, and it is also a worthwhile way to strengthen one's statistical foundations.

This post is licensed under CC BY 4.0 by the author.