In many cases an activation function is denoted g (e.g. in Andrew Ng's courses), especially when it doesn't refer to a specific activation function such as sigmoid.
Where does this convention come from, and why did g come to be used?