
It is possible to 'initialise' a gradient boosted model with a simpler model, such as a linear regression, by manually setting the initial score. This seems to help reduce the discrepancies between the two models (and the so-called 'hallucinations' of the ML model). I was wondering what strategies would allow one to make a neural network closer to a linear regression.
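For context, a minimal sketch of the boosting initialisation I mean, using scikit-learn's `GradientBoostingRegressor`, whose `init` parameter accepts an estimator whose predictions become the initial score (the synthetic dataset and hyperparameters below are just placeholders):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Boosting starts from the linear model's predictions instead of the
# default constant (the target mean), so the trees only learn corrections.
gbm = GradientBoostingRegressor(init=LinearRegression(), random_state=0)
gbm.fit(X, y)
```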

I can imagine some solutions:

  • Weight initialisation: if the last layer is initialised to reproduce the linear model and the previous layers are initialised close to the identity, I would expect the output of the freshly initialised network to be close to that of the linear model (see the first sketch after this list).
  • Linear penalisation: add a penalty term to the loss that pulls the network's outputs closer to the linear model's outputs (see the second sketch after this list).
  • Penalisation in general: starting with a reduction in size, I can imagine that a sufficiently 'handicapped' NN won't be far off from a linear model.
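
On the first idea, one construction that makes the initial network exactly equal to a fitted linear regression is a residual design: a linear skip path carries the fitted coefficients, and the nonlinear branch has its last layer zero-initialised so it contributes nothing at the start. A hedged PyTorch sketch of what I have in mind (the class name, dimensions, and architecture are my own placeholders, not from any library):

```python
import torch
import torch.nn as nn

class LinearPlusResidual(nn.Module):
    """At initialisation, output == skip(x), i.e. the fitted linear model."""

    def __init__(self, in_dim, hidden_dim, lin_weight, lin_bias):
        super().__init__()
        # Skip path: copy the fitted linear regression coefficients.
        self.skip = nn.Linear(in_dim, 1)
        with torch.no_grad():
            self.skip.weight.copy_(lin_weight)  # shape (1, in_dim)
            self.skip.bias.copy_(lin_bias)      # shape (1,)
        # Nonlinear branch that will learn corrections during training.
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )
        # Zero the final layer so the residual branch outputs 0 at init.
        nn.init.zeros_(self.mlp[-1].weight)
        nn.init.zeros_(self.mlp[-1].bias)

    def forward(self, x):
        return self.skip(x) + self.mlp(x)

# Usage sketch: fit a linear regression first, then hand its coefficients
# to the network, e.g.
#   lin = LinearRegression().fit(X, y)
#   w = torch.tensor(lin.coef_, dtype=torch.float32).reshape(1, -1)
#   b = torch.tensor([lin.intercept_], dtype=torch.float32)
#   model = LinearPlusResidual(X.shape[1], 64, w, b)
```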
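On the second idea, the penalty can be written as a distillation-style anchor: the usual fit loss plus a weight times the squared distance between the NN's predictions and a pre-fitted linear model's predictions. Another hedged sketch (the function name `anchored_loss` and the weight `lam` are placeholders I chose):

```python
import torch.nn.functional as F

def anchored_loss(nn_pred, linear_pred, target, lam=0.1):
    # Standard fit term plus an anchor pulling the NN toward the
    # linear model's outputs; lam controls how strongly it is anchored.
    fit = F.mse_loss(nn_pred, target)
    anchor = F.mse_loss(nn_pred, linear_pred)
    return fit + lam * anchor
```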

Are these valid (and documented) strategies? Are there other alternatives?

Lucas Morin
