I have a simple linear model and I need to calculate its loss. I tried both CrossEntropyLoss and NLLLoss, and I want to understand how the grads are calculated with each of them.

The output layer has 4 neurons, which means I am classifying into 4 classes.

L1 = nn.Linear(2,4)

When I use CrossEntropyLoss I get grads for all the parameters:

L1.weight.grad
tensor([[ 0.1212,  0.2424],
        [ 0.1480,  0.2961],
        [ 0.1207,  0.2414],
        [-0.3899, -0.7798]])

But when I use NLLLoss, I only get grads for the params of the true class:

L1.weight.grad
tensor([[ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [-1., -2.]])

May I know how the grads are calculated in both methods?
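
One way to see where the two gradients come from is to reproduce them by hand. The sketch below is only an illustration under assumptions: it uses a single input x = [1., 2.] and target class 3, neither of which is stated above, although both are consistent with the tensors shown. It also assumes NLLLoss was applied directly to the raw output of L1 with no log_softmax in between, which would explain the zero rows. The helper `weight_grad` is a hypothetical name introduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Assumed input and target (not given in the question).
x = torch.tensor([[1.0, 2.0]])
target = torch.tensor([3])

L1 = nn.Linear(2, 4)

def weight_grad(loss_fn, transform=lambda z: z):
    """Hypothetical helper: run one backward pass and return a copy of L1.weight.grad."""
    L1.zero_grad()
    loss = loss_fn(transform(L1(x)), target)
    loss.backward()
    return L1.weight.grad.clone()

one_hot = F.one_hot(target, num_classes=4).float()
probs = F.softmax(L1(x), dim=1).detach()

# CrossEntropyLoss combines log_softmax and NLLLoss, so for logits z:
#   dL/dz = softmax(z) - one_hot   -> every row of the weight grad is non-zero
grad_ce = weight_grad(nn.CrossEntropyLoss())
manual_ce = (probs - one_hot).t() @ x

# NLLLoss on raw outputs is just -z[target], so:
#   dL/dz = -one_hot               -> only the true-class row is non-zero
grad_nll_raw = weight_grad(nn.NLLLoss())
manual_nll_raw = (-one_hot).t() @ x

# NLLLoss on log-probabilities, which is the input it expects.
grad_nll_logp = weight_grad(nn.NLLLoss(), transform=lambda z: F.log_softmax(z, dim=1))

print(torch.allclose(grad_ce, manual_ce, atol=1e-6))
print(torch.allclose(grad_nll_raw, manual_nll_raw))
print(torch.allclose(grad_nll_logp, grad_ce, atol=1e-6))
```

Under these assumptions, the weight gradient is the outer product of dL/dz with the input. With CrossEntropyLoss, dL/dz is softmax(z) minus the one-hot target, so all four rows receive a gradient. With NLLLoss on raw outputs, the loss is simply the negated true-class output, so only that row is non-zero, and it equals -x. Applying log_softmax before NLLLoss should give the same gradient as CrossEntropyLoss.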

Mahdi Amrollahi
