import torch
import torch.nn.functional as F

logits = torch.Tensor([0, 1])
counts = logits.exp()
probs = counts / counts.sum()  # equivalent to softmax(logits, dim=0)

loss = F.cross_entropy(logits, probs)

Here, loss is roughly equal to 0.5822.

However, I would expect it to be 0.

If I understand the docs correctly, torch.nn.functional.cross_entropy can accept an array of logits and an array of probabilities as its input and target parameters, respectively (both passed as torch.Tensors).

I believe probs is the true distribution, so I would expect F.cross_entropy to return 0.

Why is loss not 0?

1 Answer


PyTorch treats your logits as raw model outputs that it first converts to probabilities by running them through a softmax:

p_log = torch.log(F.softmax(logits, dim=0))  # log-probabilities PyTorch derives from the logits
-torch.dot(p_log, probs)                     # cross entropy: -sum(probs * log(softmax(logits)))

tensor(0.5822)
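
Since probs is exactly softmax(logits), the value above is H(probs, probs), i.e. the entropy of probs, and the entropy of a distribution is only 0 when it is one-hot. A quick check of that equality (using torch.distributions.Categorical as just one way to verify it):

import torch
import torch.nn.functional as F
from torch.distributions import Categorical

logits = torch.Tensor([0, 1])
probs = F.softmax(logits, dim=0)              # same distribution as counts / counts.sum()

loss = F.cross_entropy(logits, probs)         # what PyTorch computes: H(probs, softmax(logits))
entropy = Categorical(probs=probs).entropy()  # entropy of the target distribution

print(loss, entropy)  # both roughly 0.5822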

There is some discussion here: https://discuss.pytorch.org/t/why-does-crossentropyloss-include-the-softmax-function/4420. I think the naming could be clearer; in TensorFlow, for example, the equivalent function is tf.nn.softmax_cross_entropy_with_logits.
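
If what you were actually after is a quantity that is 0 when the predicted distribution matches the target, that is the KL divergence rather than the cross entropy. A minimal sketch using F.kl_div, which expects log-probabilities as its first argument:

import torch
import torch.nn.functional as F

logits = torch.Tensor([0, 1])
probs = F.softmax(logits, dim=0)

log_probs = F.log_softmax(logits, dim=0)          # predicted log-probabilities
kl = F.kl_div(log_probs, probs, reduction='sum')  # D_KL(probs || softmax(logits))

print(kl)  # roughly 0, since the two distributions are identical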

brewmaster321