Should we add the bias to each entry of the convolution and then sum, or add the bias once after the convolution has been calculated in CNNs?
What do you mean by "sum"? Can you express your question mathematically? It's a bit vague in words. – Louis T Nov 08 '17 at 21:57
2 Answers
Short answer: the bias is added once after the convolution has been calculated.
Long answer: the discrete convolution you see in CNNs is a linear function applied to the pixel values in a small region of an image. The output of this linear function is then passed through some nonlinearity (like ReLU). For an $i \times j$ region $\mathbf{x}$ of an image and a convolutional filter $\mathbf{k}$, with no bias term, this linear function $f$ is defined as:
$$ f(\mathbf{x}, \mathbf{k}) = \mathbf{x}*\mathbf{k} = \sum_{i,j} k_{i,j} x_{i,j} $$
Without a bias term, this linear function $f$ must go through the origin. In other words, if $\mathbf{x}$ or $\mathbf{k}$ is all zeroes, the output of $f$ will be zero as well. This may not be desirable, so we add a bias term $b$. This gives the model more flexibility by providing a value that is always added to the output of the convolution, regardless of the values of $\mathbf{x}$ and $\mathbf{k}$ -- in other words, it's the intercept value.
$$ f(\mathbf{x}, \mathbf{k}, b) = b + (\mathbf{x}*\mathbf{k}) = b + \sum_{i,j} k_{i,j} x_{i,j} $$
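As a concrete numerical sketch of the formula above (the region, filter, and bias values here are made up for illustration):

```python
import numpy as np

# Illustrative 3x3 region of an image and a 3x3 filter
x = np.array([[1., 0., 2.],
              [0., 1., 0.],
              [3., 0., 1.]])
k = np.array([[ 1., 0., -1.],
              [ 1., 0., -1.],
              [ 1., 0., -1.]])
b = 0.5  # single bias, added once after the sum

# f(x, k, b) = b + sum_{i,j} k_ij * x_ij
f = b + np.sum(k * x)
print(f)  # 0.5 + (1 + 1 + 3) - (2 + 0 + 1) = 1.5
```

Note that the bias appears exactly once, regardless of the region size.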
If instead this value were added to each entry of the convolution before summing, the $i \times j$ copies of $b$ would simply collapse into a single constant $i \cdot j \cdot b$, so it would be equivalent to adding one (rescaled) bias after the sum. Adding the bias once at the end achieves the same effect with a single parameter.
Based on the answer here and the blog post here, there are two variants for using biases in convolutional layers: tied biases, where you use one bias per convolutional filter/kernel, and untied biases, where you use one bias per kernel and output location.
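The difference is easiest to see in the parameter shapes. A rough sketch (the layer sizes below are made up; only the broadcasting pattern matters):

```python
import numpy as np

out_channels, h_out, w_out = 4, 6, 6
# Placeholder for the pre-bias output of a conv layer with 4 filters
conv_out = np.zeros((out_channels, h_out, w_out))

# Tied biases: one scalar per filter, broadcast over all output locations
tied_b = np.random.randn(out_channels, 1, 1)
tied_result = conv_out + tied_b          # shape (4, 6, 6)

# Untied biases: one scalar per filter AND per output location
untied_b = np.random.randn(out_channels, h_out, w_out)
untied_result = conv_out + untied_b      # shape (4, 6, 6)

print(tied_b.size)    # 4 bias parameters
print(untied_b.size)  # 4 * 6 * 6 = 144 bias parameters
```

Tied biases are the common default in deep learning libraries; untied biases trade many more parameters for location-dependent offsets.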