3

In many cases an activation function is denoted g (e.g. in Andrew Ng's courses), especially when it doesn't refer to any specific activation function such as the sigmoid.

However, where does this convention come from? And for what reason did g start to be used?

Blaszard

1 Answer

4

Adding an activation layer creates a composition of two functions.

"A general function, to be defined for a particular context, is usually denoted by a single letter, most often the lower-case letters f, g, h."

So it comes down to this: he writes the hypothesis as h(x) = wx + b, which is itself a function, and wraps it in an activation function denoted g. The choice of g seems to be purely alphabetical, since g sits next to f and h among the letters conventionally used for generic functions.
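To make the composition concrete, here is a minimal Python sketch. The names `h`, `sigmoid`, `compose`, and `neuron` are illustrative choices of mine, not notation taken from the course, and the sigmoid simply stands in for whatever activation g happens to be.

```python
import math


def h(x, w, b):
    """Linear part: weighted sum of the inputs plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sigmoid(z):
    """One possible activation g; any other function of z would work too."""
    return 1.0 / (1.0 + math.exp(-z))


def compose(g, f):
    """Return the composition g(f(...)) as a new function."""
    return lambda *args: g(f(*args))


# The "neuron" is g composed with h: first the linear map, then the activation.
neuron = compose(sigmoid, h)

# Hypothetical inputs, weights, and bias chosen so the linear part is 0.
a = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
```

Swapping `sigmoid` for another function changes the activation without touching `h`, which is why a generic letter like g is convenient.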

user4446237
  • He didn't write the hypothesis function as `h`; he used `z`. So it was not intuitive for me why he used `g`... – Blaszard Nov 09 '17 at 19:28