
I am having a hard time understanding how to feed color images into an RBM. Most RBM tutorials only train on grayscale images.

If the image is grayscale, the input units can be binary: I can normalize the gray values to [0, 1] and treat them as probabilities in the input layer. Alternatively, I can whiten the dataset and use Gaussian units in the input layer.
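As a rough sketch of the first option (gray values scaled to [0, 1] and read as probabilities), assuming 8-bit images held in a NumPy array. The function names and the random stand-in data are illustrative only.

```python
import numpy as np

def to_probabilities(gray_u8):
    """Map 8-bit grey values (0..255) to [0, 1] and flatten each image."""
    x = np.asarray(gray_u8, dtype=np.float64) / 255.0
    return x.reshape(x.shape[0], -1)

def sample_binary(probs, rng):
    """Draw binary visible states, treating each value as P(unit = 1)."""
    return (rng.random(probs.shape) < probs).astype(np.float64)

rng = np.random.default_rng(0)
# Stand-in data: 4 random 28x28 grayscale images (hypothetical, not a real dataset).
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
probs = to_probabilities(images)   # one visible unit per pixel
v0 = sample_binary(probs, rng)     # binary states for the input layer
```

Whether to feed `probs` directly or the sampled `v0` to the RBM is a design choice; both appear in practice.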

How do I treat color images? Obviously the input units cannot be binary, unless I replicate the units for each of the three color channels. Or is there a more popular strategy?

smörkex

2 Answers


I am working on a similar project. The idea is still experimental, so I cannot guarantee results. You could convert your images to grayscale with 8 levels and use those 8 levels as the visible layer, connected to a hidden layer with (width * height) units for your images. That hidden layer then serves as the visible layer for a second hidden layer whose size is the number of your classes.
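One possible reading of "8 levels as a visible layer" is a one-hot code per pixel, so every visible unit stays binary. The sketch below assumes that reading and assumes the image has already been converted to 8-bit grayscale; the function names are hypothetical.

```python
import numpy as np

def quantize_to_levels(gray_u8, n_levels=8):
    """Map 8-bit grey values to integer levels 0..n_levels-1."""
    g = np.asarray(gray_u8, dtype=np.int64)
    return (g * n_levels) // 256

def one_hot_levels(levels, n_levels=8):
    """One binary unit per (pixel, level) pair; output shape (..., n_levels)."""
    return np.eye(n_levels, dtype=np.float64)[levels]

gray = np.array([[0, 31, 32], [128, 200, 255]], dtype=np.uint8)
levels = quantize_to_levels(gray)            # integer level per pixel
visible = one_hot_levels(levels).reshape(-1) # flat binary visible vector
```

With this encoding a width x height image needs width * height * 8 visible units, exactly one of which is active per pixel.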

Ethan

If your visible units are real-valued then your conditionals change a bit.

As shown in (Lee, Honglak and Ekanadham, Chaitanya and Ng, Andrew, 2007):

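$$P(v_i \mid h) = \mathcal{N}\Big(c_i + \sum_j W_{i,j}\, h_j,\ \sigma^2\Big)$$

where $\mathcal{N}(\text{mean}, \text{variance})$ is a Gaussian distribution.

$$P(h_j = 1 \mid v) = \operatorname{sigmoid}\Big(\frac{b_j + \sum_i W_{i,j}\, v_i}{\sigma^2}\Big)$$

where sigmoid is the logistic function.

$\sigma$ is a hyperparameter: the standard deviation of the Gaussian, so $\sigma^2$ is its variance.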

PS: It is not shown here, but h should be sampled from a Bernoulli distribution (since it is a binary layer) during Gibbs sampling.
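To make the two conditionals concrete, here is a minimal NumPy sketch of one Gibbs step with Gaussian visible units and Bernoulli hidden units. The layer sizes, the weight initialization and the random stand-in data are assumptions for illustration, and `sigma` is fixed rather than learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_h_given_v(v, W, b, sigma, rng):
    """P(h_j = 1 | v) = sigmoid((b_j + sum_i W_ij v_i) / sigma^2)."""
    p_h = sigmoid((b + v @ W) / sigma**2)
    h = (rng.random(p_h.shape) < p_h).astype(np.float64)  # Bernoulli draw
    return p_h, h

def sample_v_given_h(h, W, c, sigma, rng):
    """v_i | h ~ N(c_i + sum_j W_ij h_j, sigma^2)."""
    mean = c + h @ W.T
    return mean, mean + sigma * rng.standard_normal(mean.shape)

rng = np.random.default_rng(0)
n_visible, n_hidden, sigma = 12, 5, 1.0    # e.g. a 2x2 RGB patch gives 12 units
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_hidden)                     # hidden biases
c = np.zeros(n_visible)                    # visible biases
v0 = rng.standard_normal((3, n_visible))   # stand-in for real-valued pixels
p_h, h0 = sample_h_given_v(v0, W, b, sigma, rng)
v_mean, v1 = sample_v_given_h(h0, W, c, sigma, rng)
```

Many implementations use the mean `v_mean` instead of the noisy sample `v1` for the reconstruction; that is a common shortcut, not something the formulas require.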

This keeps your visible units real-valued, but be careful: sampling from a Gaussian distribution after a reconstruction will give you values below 0 and above 1.

So a normalization of visible units is needed to get reasonable results.
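One way to do that normalization for color images, sketched under the assumption that images are stored as 8-bit arrays of shape (N, H, W, 3): standardize each color channel, flatten so every channel of every pixel gets its own Gaussian visible unit, and clip back into [0, 1] only when displaying a reconstruction. The helper names are made up for this example.

```python
import numpy as np

def standardize_rgb(images_u8):
    """Scale to [0, 1], then give each colour channel zero mean and unit variance."""
    x = np.asarray(images_u8, dtype=np.float64) / 255.0   # (N, H, W, 3)
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    std = x.std(axis=(0, 1, 2), keepdims=True) + 1e-8     # avoid division by zero
    flat = ((x - mean) / std).reshape(x.shape[0], -1)     # (N, H*W*3)
    return flat, mean, std

def to_displayable(flat, mean, std, image_shape):
    """Undo the standardization and clip back into [0, 1]."""
    x = flat.reshape((-1,) + image_shape) * std + mean
    return np.clip(x, 0.0, 1.0)

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 4, 4, 3), dtype=np.uint8)  # stand-in data
v, mean, std = standardize_rgb(images)
noisy = v + rng.standard_normal(v.shape)   # stand-in for a Gaussian reconstruction
restored = to_displayable(noisy, mean, std, (4, 4, 3))
```

With unit-variance inputs, setting σ = 1 in the conditionals is a natural starting point.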

I haven't tested plain RBMs much, because they don't work with filters and don't preserve spatial features, since their hidden layers are basically dense layers.

It will probably be better to work with Convolutional RBMs, where you can transform an RGB image (3 channels) into a binary hidden layer with more than 3 channels/filters that try to represent your "colors".

PS: In this type of RBM your conditionals do not change much: just replace the matrix multiplication with a convolution (in the second one it becomes a transposed convolution).
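A small NumPy sketch of that substitution, assuming a "valid" convolution from visible to hidden and its transpose from hidden back to visible, with one bias per hidden map and per color channel. The shapes, filter count and helper names are illustrative assumptions, and the loops favor clarity over speed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_forward(v, filters):
    """'Valid' cross-correlation: (C, H, W) image -> (K, H-kh+1, W-kw+1) maps."""
    K, C, kh, kw = filters.shape
    oh, ow = v.shape[1] - kh + 1, v.shape[2] - kw + 1
    out = np.zeros((K, oh, ow))
    for a in range(kh):
        for b in range(kw):
            patch = v[:, a:a + oh, b:b + ow]
            out += np.einsum("kc,chw->khw", filters[:, :, a, b], patch)
    return out

def conv_transpose(h, filters, v_shape):
    """Transpose (adjoint) of conv_forward: hidden maps -> image-shaped array."""
    K, C, kh, kw = filters.shape
    oh, ow = h.shape[1], h.shape[2]
    out = np.zeros(v_shape)
    for a in range(kh):
        for b in range(kw):
            out[:, a:a + oh, b:b + ow] += np.einsum("kc,khw->chw", filters[:, :, a, b], h)
    return out

rng = np.random.default_rng(0)
sigma = 1.0
v = rng.standard_normal((3, 8, 8))                 # stand-in standardized RGB image
filters = 0.1 * rng.standard_normal((6, 3, 3, 3))  # 6 filters over 3 channels
b_hid = np.zeros((6, 1, 1))                        # one bias per hidden map
c_vis = np.zeros((3, 1, 1))                        # one bias per colour channel

p_h = sigmoid((b_hid + conv_forward(v, filters)) / sigma**2)
h = (rng.random(p_h.shape) < p_h).astype(np.float64)   # binary hidden maps
v_mean = c_vis + conv_transpose(h, filters, v.shape)
v_sample = v_mean + sigma * rng.standard_normal(v.shape)
```

Here the 3-channel image maps to 6 binary hidden maps, and the same filters are reused in both directions, just as W and its transpose are in the dense case.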

I'm writing some code here to work with RBMs, CRBMs and Deep Belief Networks using TensorFlow 2, so if you or anyone else needs help with the code and/or theory, you can check the repository and ask me in the comments below.