
In SVM, we have a kernel function that maps the raw input data space into a higher-dimensional feature space.

In CNN, we also have a 'kernel' mask that travels across the raw input data space (the image as a matrix) and maps it to another space.

Given that both of these are called 'kernel', I am wondering what the connection between them is from a mathematical perspective.

My guess is that it might have something to do with functional analysis.

eight3
  • Interesting... it just came to my mind that both SVM and NN use softmax boundaries. Would be keen to see some good answers to your question. – Peter May 20 '19 at 08:26
  • As far as I know, the kernel in the SVM is a function that maps the input feature space to a higher-dimensional space in order to make it linearly separable (it is a trick...). In conv-nets, the kernel is the patch filter used by the 2D convolutional filtering, which generates the feature maps... – ignatius May 20 '19 at 08:58
  • @ignatius yes, you are right. I am aware of that, which leads to my question: is there any conceptual connection between these two approaches? – eight3 May 20 '19 at 09:08

1 Answer


There is no direct relationship between these two concepts. However, we can find some indirect ones.

According to Merriam-Webster,

kernel means a central or essential part

which hints at why both are called "kernel". Specifically, deciding "how to measure point-to-point similarity (a.k.a. the kernel function)" is the central part of kernel methods, and deciding "what array, matrix, or tensor (a.k.a. the kernel matrix) to convolve with a data point" is the central part of convolutional neural networks.

A kernel function receives two data points, implicitly maps them into a higher-dimensional (possibly infinite-dimensional) space, and calculates their inner product there.
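To make this concrete, here is a minimal NumPy sketch (my own illustration, not part of the original answer) using the homogeneous degree-2 polynomial kernel, whose explicit feature map is small enough to write down. The kernel returns the same value as the inner product taken in the explicit feature space, without ever building that space:

```python
# A minimal sketch of the kernel-function view: the kernel computes an inner
# product in a higher-dimensional feature space without constructing it.
import numpy as np

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel k(x, y) = (x . y)^2."""
    return np.dot(x, y) ** 2

def explicit_map(x):
    """Explicit feature map for the same kernel (2-D input -> 3-D feature)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

implicit = poly2_kernel(x, y)                        # no feature map built
explicit = np.dot(explicit_map(x), explicit_map(y))  # feature map built by hand
print(implicit, explicit)  # both equal 16.0 (up to floating-point rounding)
```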

A kernel matrix (or array, or tensor) is convolved with one data point to map that data point explicitly into a new, often lower-dimensional, representation. Here, we are ignoring a subtle difference between a filter and a kernel (a filter is composed of one kernel per channel).
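By contrast, here is a minimal NumPy sketch (again my own illustration) of the CNN sense of "kernel": a small hand-picked 3x3 kernel matrix is slid over a toy 5x5 image and explicitly produces a smaller feature map:

```python
# A minimal sketch of the CNN kernel view: a small kernel matrix is slid over
# the input and explicitly produces a new (here smaller) feature map.
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' cross-correlation, as used in CNN layers (no padding, no flipping)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)    # toy 5x5 "image"
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])                  # a hand-picked edge detector

feature_map = conv2d_valid(image, kernel)           # explicit 3x3 output
print(feature_map.shape)  # (3, 3): the data point is mapped to a lower dimension
```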

Therefore, these two concepts are indirectly related in that both map data to a new representation. However,

  • Kernel functions map implicitly, but kernel matrices map explicitly,
  • Kernel functions cannot be stacked over each other (shallow representation), but kernel matrices can be, since the input and output (explicit representations) have the same structure (deep representation),
  • The non-linearity of the map is built into kernel functions, but for kernel matrices we must apply a non-linear activation function after the (input, kernel) convolution to achieve a similar non-linearity,
  • Implicit representations cannot be learned for kernel functions: a specific function implies a specific representation. For kernel matrices, however, the representation can be learned by adjusting (learning) the kernel weights, and can be enriched by stacking kernels over each other (see the sketch after this list).
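A small sketch of the last three points, assuming plain NumPy and random weights standing in for learned ones: kernel matrices can be stacked with a ReLU non-linearity in between, and their entries are exactly the parameters a CNN would learn.

```python
# A minimal sketch: kernel matrices stacked with a non-linear activation (ReLU)
# between them; the kernel entries are placeholders for learned CNN weights.
import numpy as np

def conv2d_valid(image, kernel):
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

relu = lambda x: np.maximum(x, 0.0)

rng = np.random.default_rng(0)
image = rng.normal(size=(7, 7))
k1 = rng.normal(size=(3, 3))   # stand-in for a learned first-layer kernel
k2 = rng.normal(size=(3, 3))   # stand-in for a learned second-layer kernel

layer1 = relu(conv2d_valid(image, k1))   # explicit 5x5 representation
layer2 = relu(conv2d_valid(layer1, k2))  # stacked again -> explicit 3x3 representation
print(layer1.shape, layer2.shape)        # (5, 5) (3, 3)
```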
Esmailian