I am trying to understand the learning target of DDPM, specifically the term $D_{KL}(q(x_{1:T}|x_0)\,\|\,p_\theta(x_{1:T}|x_0))$ in the following inequality.
$$ -\log p_\theta(x_0) \le -\log p_\theta(x_0) + D_{KL}(q(x_{1:T}|x_0)\,\|\,p_\theta(x_{1:T}|x_0)) $$
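For context, here is how I follow the derivation so far (from the linked post): since $p_\theta(x_{1:T}|x_0) = p_\theta(x_{0:T})/p_\theta(x_0)$, the KL term expands as

$$ D_{KL}(q(x_{1:T}|x_0)\,\|\,p_\theta(x_{1:T}|x_0)) = \mathbb{E}_{q(x_{1:T}|x_0)}\left[\log \frac{q(x_{1:T}|x_0)}{p_\theta(x_{0:T})}\right] + \log p_\theta(x_0), $$

so the right-hand side above equals the variational bound $L_\text{VLB} = \mathbb{E}_{q(x_{1:T}|x_0)}\left[\log \frac{q(x_{1:T}|x_0)}{p_\theta(x_{0:T})}\right]$, and the inequality holds because the KL divergence is non-negative.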
$x_0$ is the observed variable, and $x_T$ is a variable in latent space with $x_T \sim \mathcal{N}(0, \mathbf{I})$.
Should the intermediate $x_t$ ($0 < t < T$) be seen as latent variables as well, even though they do not follow a standard normal distribution?
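To make this concrete, here is a toy numerical sketch of the forward process $q(x_t|x_0) = \mathcal{N}(\sqrt{\bar{\alpha}_t}\,x_0, (1-\bar{\alpha}_t)\mathbf{I})$, assuming a linear $\beta_t$ schedule; the variable names and toy data are my own, not from the post:

```python
import numpy as np

# Toy sketch of the DDPM forward (noising) process q(x_t | x_0),
# assuming a linear beta schedule; names here are my own choices.
rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # variance schedule beta_1 ... beta_T
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # \bar{alpha}_t = prod_{s<=t} alpha_s

def q_sample(x0, t):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

x0 = rng.standard_normal(8) + 5.0    # toy "data" whose mean is far from 0
for t in [0, T // 2, T - 1]:
    xt = q_sample(x0, t)
    # the mean shrinks toward 0 only as t -> T; intermediate x_t are
    # neither the data distribution nor a standard normal
    print(t, round(float(xt.mean()), 3), round(float(xt.std()), 3))
```

Running this, only the final $x_T$ has mean near $0$, which is what prompts my question above about how the intermediate $x_t$ should be interpreted.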
Our goal is to generate new images using $p(x|z)$, so why does the model learn $p(z|x)$ instead? Here $z$ denotes the latent variables and $x$ the observable variable.
Details: https://lilianweng.github.io/posts/2021-07-11-diffusion-models/#nice