Is anyone familiar with an approach like this:
Suppose I want to build a Bayesian neural network, with distributions over my parameters instead of point estimates. First I train the network with standard backprop, which gives me a point estimate (effectively a MAP solution if I use weight decay). After training, I run MCMC sampling over the network weights, initialized at that solution, in order to estimate the posterior distributions.
This approach seems very natural to me, so there are probably already papers on it, or drawbacks that I'm not seeing. I'm aware that sampling can be a slow process, but with a good starting point it should mix reasonably fast. Sampling could also be run in batch mode to speed things up.
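To make the idea concrete, here is a minimal sketch in NumPy (the toy data, network size, and all hyperparameters are made up for illustration): a tiny network is first fit by gradient ascent on the log posterior, and a random-walk Metropolis sampler is then started from the trained weights. A real implementation would use a proper autodiff framework and a better sampler (e.g. HMC), but the two-stage structure is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (hypothetical)
X = rng.uniform(-2, 2, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

def unpack(theta):
    # Tiny MLP 1 -> 4 -> 1, weights flattened into one vector
    W1 = theta[:4].reshape(1, 4)
    b1 = theta[4:8]
    W2 = theta[8:12].reshape(4, 1)
    b2 = theta[12]
    return W1, b1, W2, b2

def predict(theta, X):
    W1, b1, W2, b2 = unpack(theta)
    h = np.tanh(X @ W1 + b1)
    return (h @ W2).ravel() + b2

def log_post(theta, sigma=0.25, tau=1.0):
    # Gaussian likelihood (noise scale sigma) + Gaussian prior (scale tau)
    resid = y - predict(theta, X)
    ll = -0.5 * np.sum(resid**2) / sigma**2
    lp = -0.5 * np.sum(theta**2) / tau**2
    return ll + lp

def num_grad(f, t, eps=1e-5):
    # Finite-difference gradient; a real network would use backprop here
    g = np.zeros_like(t)
    for i in range(t.size):
        e = np.zeros_like(t)
        e[i] = eps
        g[i] = (f(t + e) - f(t - e)) / (2 * eps)
    return g

# Stage 1: "training" -- gradient ascent on the log posterior (MAP fit)
theta = 0.1 * rng.standard_normal(13)
for _ in range(3000):
    theta += 5e-4 * num_grad(log_post, theta)

# Stage 2: random-walk Metropolis, started from the trained weights
samples = []
lp = log_post(theta)
for _ in range(5000):
    prop = theta + 0.01 * rng.standard_normal(theta.size)
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject step
        theta, lp = prop, lp_prop
    samples.append(theta.copy())

samples = np.array(samples)
post_mean = samples.mean(axis=0)  # posterior mean of each weight
post_std = samples.std(axis=0)    # posterior spread around the MAP
```

Starting the chain at the trained weights skips most of the burn-in you would otherwise need, which is the intuition behind "with a good starting point it should run quite fast" -- though the chain still has to explore the mode, and a single chain started at one mode will not see other modes of the posterior.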