
I am trying to understand the learning target of DDPM, specifically the term $D_{KL}(q(x_{1:T}|x_0)\,\|\,p_\theta(x_{1:T}|x_0))$ in the following line:

$$ -\log p_\theta(x_0) \le -\log p_\theta(x_0) + D_{KL}(q(x_{1:T}|x_0)\,\|\,p_\theta(x_{1:T}|x_0)) $$
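(My current reading, to check that I am parsing this correctly: the inequality holds simply because the KL divergence is nonnegative, and with $p_\theta(x_{1:T}|x_0) = p_\theta(x_{0:T})/p_\theta(x_0)$ the right-hand side simplifies to the variational bound $L_{VLB}$ from the linked post:)

$$
-\log p_\theta(x_0) + D_{KL}(q(x_{1:T}|x_0)\,\|\,p_\theta(x_{1:T}|x_0))
= -\log p_\theta(x_0) + \mathbb{E}_{q(x_{1:T}|x_0)}\left[\log \frac{q(x_{1:T}|x_0)\, p_\theta(x_0)}{p_\theta(x_{0:T})}\right]
= \mathbb{E}_{q(x_{1:T}|x_0)}\left[\log \frac{q(x_{1:T}|x_0)}{p_\theta(x_{0:T})}\right] = L_{VLB}
$$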

$x_0$ is the observed variable, and $x_T$ is a variable from the latent space, $x_T \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$.

Should the $x_t$ with $t<T$ be seen as latent variables as well? They do not follow a standard normal distribution.
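
To make this concrete, here is a minimal sketch of the forward process as I understand it, assuming the linear $\beta_t$ schedule from Ho et al. 2020 and the closed form $q(x_t|x_0) = \mathcal{N}(\sqrt{\bar\alpha_t}\,x_0, (1-\bar\alpha_t)\mathbf{I})$ given in the linked post (variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear beta schedule (assumed here for illustration).
T = 1000
beta = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - beta)  # \bar{\alpha}_t = \prod_s (1 - \beta_s)

def sample_xt(x0, t):
    """Sample x_t from the closed form q(x_t | x_0) = N(sqrt(abar_t) x0, (1 - abar_t) I)."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

x0 = np.ones(4)  # a toy "observed" sample
for t in (0, T // 2, T - 1):
    xt = sample_xt(x0, t)
    # The mean scale sqrt(alpha_bar_t) shrinks toward 0 as t -> T.
    print(t, round(float(np.sqrt(alpha_bar[t])), 4), xt.round(3))
```

So each intermediate $x_t$ is Gaussian around a scaled-down $x_0$, not standard normal; only $x_T$ is approximately $\mathcal{N}(\mathbf{0}, \mathbf{I})$. Is that why they still count as latents?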

Our goal is to generate new images using $p(x|z)$, so why does the model learn $p(z|x)$ instead? Here $z$ denotes the latent variables and $x$ the observable variable.

Details: https://lilianweng.github.io/posts/2021-07-11-diffusion-models/#nice
