I have a simple linear model and need to compute its loss. I tried both CrossEntropyLoss and NLLLoss, and I want to understand how the gradients are calculated in each case.
The output layer has 4 neurons, which means I am classifying into 4 classes.
L1 = nn.Linear(2, 4)  # 2 input features, 4 output classes
When I use CrossEntropyLoss, I get gradients for all the parameters:
L1.weight.grad
tensor([[ 0.1212, 0.2424],
[ 0.1480, 0.2961],
[ 0.1207, 0.2414],
[-0.3899, -0.7798]])
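For reference, here is a minimal sketch of how this kind of gradient can be reproduced and checked by hand. The input x = [1, 2], the target class 3, and the random seed are assumptions (they are not stated above; the input and target are only inferred from the shape of the gradients shown), so the exact numbers will differ. For a single sample, the weight gradient of CrossEntropyLoss should equal (softmax(logits) - one_hot(target)) multiplied by the input, which is why every row is non-zero.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for repeatability

# Assumed values, not given in the post: one sample x = [1, 2], true class 3.
x = torch.tensor([[1.0, 2.0]])
target = torch.tensor([3])

L1 = nn.Linear(2, 4)
logits = L1(x)  # raw scores, shape (1, 4)

# CrossEntropyLoss applies log-softmax internally, then the negative log-likelihood.
loss = nn.CrossEntropyLoss()(logits, target)
loss.backward()

# Manual gradient w.r.t. the weights: (softmax - one_hot)^T @ x, shape (4, 2).
probs = F.softmax(logits.detach(), dim=1)
one_hot = F.one_hot(target, num_classes=4).float()
manual_grad = (probs - one_hot).t() @ x

print(L1.weight.grad)
print(manual_grad)
```

The two printed tensors should agree. Each row is a multiple of x, and the rows sum to zero because the softmax probabilities sum to one.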
But when I use NLLLoss, I only get gradients for the parameters of the true class:
L1.weight.grad
tensor([[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[-1., -2.]])
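A minimal sketch of what may be going on here, again assuming an input x = [1, 2] and target class 3 (inferred from the values above, not stated in the post). NLLLoss expects log-probabilities as input. If it is given the raw output of the linear layer, the loss is just minus the output of the true class, so only that row of the weights receives a gradient, equal to -x. Applying log_softmax first couples all four outputs and should give the same gradient as CrossEntropyLoss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for repeatability

# Assumed values, not given in the post: one sample x = [1, 2], true class 3.
x = torch.tensor([[1.0, 2.0]])
target = torch.tensor([3])
L1 = nn.Linear(2, 4)

# Case 1: NLLLoss on raw outputs. The loss reduces to -out[0, 3].
nn.NLLLoss()(L1(x), target).backward()
grad_raw = L1.weight.grad.clone()

# Case 2: NLLLoss on log-probabilities (log_softmax applied first).
L1.zero_grad()
nn.NLLLoss()(F.log_softmax(L1(x), dim=1), target).backward()
grad_log_softmax = L1.weight.grad.clone()

# Case 3: CrossEntropyLoss on raw outputs, for comparison with case 2.
L1.zero_grad()
nn.CrossEntropyLoss()(L1(x), target).backward()
grad_ce = L1.weight.grad.clone()

print(grad_raw)          # only the true-class row is non-zero
print(grad_log_softmax)  # all rows non-zero
print(grad_ce)           # should match grad_log_softmax
```

Under these assumptions, case 1 does not depend on the weights at all, which is consistent with the clean -1 and -2 values shown above.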
How are the gradients calculated in each case?