I am currently digging into Uncertainty Quantification and trying to implement Aleatoric Uncertainty estimation in a regression model.
Following this publication, we can model the Aleatoric Uncertainty by splitting the model output into $y_\text{pred}$ and $\log{\sigma^2}$ and training with the corresponding loss function:
$$ L_{\text{aleatoric}} = \frac{1}{2N}\sum_{i=1}^{N}\left(\frac{\lVert y_i - y_{i,\text{pred}}\rVert^2}{\sigma_i^2} + \log(\sigma_i^2)\right) $$
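For context, here is a minimal sketch of how I understand this loss in PyTorch (the function name, the two output heads for $y_\text{pred}$ and $\log\sigma^2$, and the 1-D target shape are just my assumptions for illustration):

```python
import torch

def aleatoric_loss(y_true, y_pred, log_var):
    """Heteroscedastic regression loss (sketch).

    Assumes the network has two output heads: y_pred (the mean) and
    log_var = log(sigma^2), each of shape (N,) for 1-D targets.
    """
    # Precision-weighted squared error plus the log-variance penalty.
    squared_error = (y_true - y_pred).pow(2)
    return 0.5 * (squared_error / log_var.exp() + log_var).mean()
```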
However, my model performs very poorly with this loss: the prediction $y_\text{pred}$ is off by roughly a factor of 2 compared to the baseline model without Aleatoric UQ.
Thus my question: would it be a viable option to penalise the uncertainty estimate by a factor $\alpha$, so that the model focuses more on the error term in the numerator?
The equation below shows how I imagine this.
$$ L_{\text{aleatoric, custom}} = \frac{1}{2N}\sum_{i=1}^{N}\left(\frac{\lVert y_i - y_{i,\text{pred}}\rVert^2}{\sigma_i^2} + \alpha \cdot \log(\sigma_i^2)\right) \quad \text{with} \quad \alpha > 1 $$
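In code, I imagine the change would simply scale the log-variance term (again only a sketch, with $\alpha$ as an extra hyperparameter):

```python
def aleatoric_loss_custom(y_true, y_pred, log_var, alpha=2.0):
    """Same sketch as above, but with the log-variance (regularisation)
    term scaled by alpha > 1 to penalise large uncertainty estimates more."""
    squared_error = (y_true - y_pred).pow(2)
    return 0.5 * (squared_error / log_var.exp() + alpha * log_var).mean()
```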
I am not completely sure how to derive the above loss function, so I am unsure whether my uncertainty estimate would still be meaningful if I scale the regularisation term in the loss function this way.