According to the PyTorch documentation, nn.CrossEntropyLoss() combines nn.LogSoftmax() and nn.NLLLoss() in a single class. However, calling nn.CrossEntropyLoss() gives a different result than applying nn.LogSoftmax() followed by nn.NLLLoss() manually, as the output of the code below shows.
What could be causing the different results here? (A minimal sketch of the equivalence I would expect, using class-index targets, is included after my code for comparison.)
Cross Entropy from PyTorch: tensor(2.3573)
Cross Entropy from Manual_PyTorch_NLLLoss: tensor(1.0137)
import torch
import torch.nn as nn

def CrossEntropyPyTorch(values, actualProb):
    tensorValues = torch.FloatTensor(values)
    tensorActualProb = torch.FloatTensor(actualProb)
    criterion = nn.CrossEntropyLoss()  # LogSoftmax + NLLLoss
    loss = criterion(tensorValues, tensorActualProb)
    return loss
def CrossEntropyManual_PyTorch_NLLLoss(values, actualProb):
    tensor = torch.FloatTensor(values)
    tensorValues = nn.LogSoftmax(dim=0)(tensor)
    # Apply NLLLoss to the log-probabilities
    criterion = nn.NLLLoss()
    tensorActualProb = torch.LongTensor(actualProb)
    loss = criterion(tensorValues, tensorActualProb)
    return loss
if __name__ == '__main__':
    values = [-.03, .4, .5]
    actualProb = [1, 0, 1]
    print("Cross Entropy from PyTorch:", CrossEntropyPyTorch(values, actualProb))
    print("Cross Entropy from Manual_PyTorch_NLLLoss:", CrossEntropyManual_PyTorch_NLLLoss(values, actualProb))