Questions tagged [cross-entropy]
20 questions
1
vote
1 answer
PyTorch CrossEntropyLoss and LogSoftmax + NLLLoss give different results
As per the PyTorch documentation, CrossEntropyLoss() is a combination of the LogSoftmax() and NLLLoss() functions. However, calling CrossEntropyLoss() gives different results compared to calling LogSoftmax() and NLLLoss(), as seen from the output of the given…
cbelwal
- 113
- 3
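A minimal sketch (with made-up logits and targets) of the equivalence the documentation describes; when the two paths disagree, the usual culprits are applying LogSoftmax along the wrong dim or softmaxing the logits twice:

import torch
import torch.nn as nn

logits = torch.randn(3, 5)          # batch of 3 examples, 5 classes
targets = torch.tensor([1, 0, 4])   # integer class indices

ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)
print(torch.allclose(ce, nll))      # True: the two formulations agree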
1
vote
1 answer
Loss on whole sequences in Causal Language Model
I'd like to know, from an implementation point of view, when training a causal Transformer such as GPT-2, whether making predictions on the whole sequence at once and computing the loss over the whole sequence is standard practice.
When going across examples…
Valentin Macé
- 137
- 4
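For reference, a minimal sketch of the usual GPT-2-style setup (shapes are made up here): the model scores every position in one forward pass, the logits and labels are shifted by one token, and a single cross-entropy is computed over the whole sequence:

import torch
import torch.nn.functional as F

batch, seq_len, vocab = 2, 8, 100
logits = torch.randn(batch, seq_len, vocab)      # one forward pass over the whole sequence
input_ids = torch.randint(0, vocab, (batch, seq_len))

shift_logits = logits[:, :-1, :]                 # position t predicts token t+1
shift_labels = input_ids[:, 1:]

loss = F.cross_entropy(
    shift_logits.reshape(-1, vocab),             # flatten batch and time dimensions
    shift_labels.reshape(-1),
)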
1
vote
0 answers
Loss function for classification problem
I'm working on a classification problem: I used a convolutional neural network to classify grayscale ECG beat images of dimension 200x200 (around 4000 training images for each of 4 classes). The model is shown below:
I'm…
imene
- 23
- 3
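Since the question's model is not reproduced here, this is only an assumed stand-in: a small CNN over 200x200 grayscale inputs with a 4-class head, trained with CrossEntropyLoss on integer labels (note the head outputs raw logits, with no softmax before the loss):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 200 -> 100
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 100 -> 50
    nn.Flatten(),
    nn.Linear(32 * 50 * 50, 4),                                   # 4-class logits
)

images = torch.randn(8, 1, 200, 200)
labels = torch.randint(0, 4, (8,))
loss = nn.CrossEntropyLoss()(model(images), labels)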
1
vote
0 answers
Shannon Information Content related to Uncertainty?
I'm a data science student currently writing my master's thesis, which revolves around the cross-entropy (CE) loss function for neural networks. From my understanding, the CE is based on the entropy, which in turn is based on the Shannon information…
xflashx
- 11
- 1
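The standard definitions tie the three ideas together: the Shannon information content of an outcome, the entropy as its expectation, and the cross-entropy as the same expectation taken under the true distribution while coding with the model's distribution:
$$
I(x) = -\log P(x), \qquad
H(P) = \mathbb{E}_{x \sim P}\,[I(x)] = -\sum_x P(x)\log P(x), \qquad
H(P^*, P) = -\sum_x P^*(x)\log P(x)
$$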
1
vote
2 answers
Why is cross entropy based on Bernoulli or Multinoulli probability distribution?
When we use logistic regression, we use cross entropy as the loss function. However, based on my understanding and https://machinelearningmastery.com/cross-entropy-for-machine-learning/, cross entropy evaluates if two or more distributions are…
Feng Chen
- 207
- 1
- 9
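One way to see the connection: for a Bernoulli-distributed label $y \in \{0, 1\}$ with predicted probability $\hat{y}$, the negative log-likelihood is exactly the binary cross-entropy term used in logistic regression (the multinoulli case generalises this to a sum over classes):
$$
-\log p(y \mid \hat{y}) = -\bigl[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\bigr]
$$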
0
votes
0 answers
Loss function for classification rewarding closer guesses?
The default loss function in multi-class classification is cross_entropy, which treats all wrong guesses equally. If the distance between buckets is meaningful, for example, given the real bucket is 5, the guess 6 is considered 3 times better than…
jerron
- 1
- 1
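One possible direction (an assumption, not taken from the question): replace plain cross-entropy with an expected-distance penalty, so that probability mass placed near the true bucket is punished less than mass placed far away:

import torch
import torch.nn.functional as F

num_buckets = 10
logits = torch.randn(4, num_buckets)
targets = torch.tensor([5, 2, 9, 0])

probs = F.softmax(logits, dim=1)
bucket_ids = torch.arange(num_buckets).float()
distance = (bucket_ids.unsqueeze(0) - targets.unsqueeze(1).float()).abs()
loss = (probs * distance).sum(dim=1).mean()   # smaller when mass sits near the true bucket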
0
votes
0 answers
Is there a canonical cross entropy from the confusion matrix?
In
Wu, MT. Confusion matrix and minimum cross-entropy metrics based motion recognition system in the classroom. Sci Rep 12, 3095 (2022). https://doi.org/10.1038/s41598-022-07137-z
the author uses a peculiar "cross-entropy formula",
$$
L=-\sum_i…
arivero
- 101
- 1
0
votes
0 answers
Train neural network to predict multiple distributions
I aim to train a neural network to predict 2 distributions (10 quantiles, i.e. deciles) at 5 time points. So my y is of shape:
(batch size, time points, distribution values) => (batch size, 5, 20)
The distributions are standard quantiles, summing to…
A_Murphy
- 30
- 5
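Under one possible reading of the setup (2 distributions x 10 deciles per time point), the (batch, 5, 20) output can be reshaped so each distribution is normalised and scored separately with a soft-target cross-entropy; this is only a sketch of that assumption:

import torch
import torch.nn.functional as F

batch = 4
pred = torch.randn(batch, 5, 20).reshape(batch, 5, 2, 10)      # 2 distributions x 10 deciles
target = torch.softmax(torch.randn(batch, 5, 2, 10), dim=-1)   # dummy target distributions

log_probs = F.log_softmax(pred, dim=-1)                        # normalise each distribution
loss = -(target * log_probs).sum(dim=-1).mean()                # cross-entropy against soft targets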
0
votes
1 answer
Why is the calculated cross-entropy not zero?
import torch
import torch.nn.functional as F

logits = torch.Tensor([0, 1])
counts = logits.exp()
probs = counts / counts.sum()  # equivalent to softmax
loss = F.cross_entropy(logits, probs)
Here, loss is roughly equal to 0.5822.
However, I would expect it to…
Tsiolkovsky
- 3
- 2
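The value reported in the question matches the entropy of probs: with a soft target equal to the softmax of the logits themselves, the loss is the cross-entropy of a distribution with itself, which is its entropy, and that is zero only for a one-hot target. A quick check:

import torch

probs = torch.softmax(torch.tensor([0.0, 1.0]), dim=0)
entropy = -(probs * probs.log()).sum()
print(entropy)   # ~0.5822, the same value F.cross_entropy returned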
0
votes
1 answer
Why is cross entropy loss averaged and not used directly as a sum during model training (such as in neural networks)?
Why is the cross entropy loss for all training examples (or the training examples in a batch) averaged over the size of the training set (or the batch size)?
Why is it not just summed and used?
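Both reductions are available in PyTorch; a common argument for the default mean is that it keeps the loss scale, and therefore a reasonable learning rate, roughly independent of the batch size. A quick sketch of the relationship:

import torch
import torch.nn as nn

logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))

mean_loss = nn.CrossEntropyLoss(reduction='mean')(logits, targets)   # the default
sum_loss = nn.CrossEntropyLoss(reduction='sum')(logits, targets)
print(torch.allclose(sum_loss, mean_loss * 8))                       # sum = mean * batch size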
0
votes
0 answers
Getting Error: TypeError: cross_entropy_loss(): argument 'target' (position 2) must be Tensor, not tuple
I am working on a CNN multi-class classification of different concentrations (10uM, 30uM, etc.). I create my dataset to include the images as the features and the concentrations as labels. Note that the concentrations are left as strings. When…
Zelreedy
- 3
- 2
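The error message suggests the Dataset is returning the raw string labels, which the DataLoader then collates into a tuple. A hypothetical sketch (class and variable names invented here) of mapping the concentration strings to integer class indices and returning tensors:

import torch
from torch.utils.data import Dataset

class ConcentrationDataset(Dataset):
    def __init__(self, images, labels):
        classes = sorted(set(labels))                                  # e.g. ["10uM", "30uM", ...]
        self.class_to_idx = {c: i for i, c in enumerate(classes)}
        self.images = images
        self.targets = torch.tensor([self.class_to_idx[l] for l in labels])

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.targets[idx]                     # tensor target, not a string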
0
votes
0 answers
How does cross-entropy loss change with the number of classes?
How does the value of the cross-entropy loss function vary with the number of classes being predicted?
Formally, if the loss function is
$$
L = - \sum_{x \in X} P^*(x) \log P(x)
$$
where $P^*(\cdot)$ is the true distribution, $P(\cdot)$ is the…
sdg
- 1
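One concrete reference point: for an uninformative (uniform) prediction over K classes, the cross-entropy is log K, so the baseline loss grows with the class count. A quick check:

import math
import torch
import torch.nn.functional as F

for K in (2, 10, 100):
    logits = torch.zeros(1, K)                    # uniform softmax output
    target = torch.tensor([0])
    loss = F.cross_entropy(logits, target)
    print(K, loss.item(), math.log(K))            # loss equals log(K)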
0
votes
0 answers
How backward() is calculated in CrossEntropyLoss?
I have a simple linear model and I need to calculate the loss for it. I applied both CrossEntropyLoss and NLLLoss, but I want to understand how the gradients are calculated in each of these methods.
On the output layer, I have 4 neurons, which means I am going to…
Mahdi Amrollahi
- 263
- 2
- 10
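For reference, the gradient of the cross-entropy loss with respect to the logits is softmax(z) minus the one-hot target (scaled by 1/batch under the default mean reduction), which can be checked against autograd:

import torch
import torch.nn.functional as F

logits = torch.randn(1, 4, requires_grad=True)    # 4 output neurons, batch of 1
target = torch.tensor([2])

loss = F.cross_entropy(logits, target)
loss.backward()

expected = F.softmax(logits.detach(), dim=1) - F.one_hot(target, 4).float()
print(torch.allclose(logits.grad, expected))      # True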
0
votes
0 answers
Can we use MSELoss and CrossEntropyLoss alongside?
Can we apply both MSELoss and CrossEntropyLoss in a single network to predict both classification and regression targets in deep learning? Suppose that we have 4 points (regression) and 5 classes (classification) to predict and we use both those loss…
Mahdi Amrollahi
- 263
- 2
- 10
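A hypothetical multi-task sketch (output sizes taken from the question, the architecture is assumed): one shared backbone, a regression head scored with MSELoss and a classification head scored with CrossEntropyLoss, combined into a single loss before backward():

import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self, in_dim=16):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU())
        self.reg_head = nn.Linear(32, 4)    # 4 regression targets
        self.cls_head = nn.Linear(32, 5)    # 5 classes

    def forward(self, x):
        h = self.backbone(x)
        return self.reg_head(h), self.cls_head(h)

model = MultiTaskNet()
x = torch.randn(8, 16)
y_reg = torch.randn(8, 4)
y_cls = torch.randint(0, 5, (8,))

reg_out, cls_out = model(x)
loss = nn.MSELoss()(reg_out, y_reg) + nn.CrossEntropyLoss()(cls_out, y_cls)
loss.backward()                             # gradients from both losses reach the shared backbone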
0
votes
1 answer
How to understand a large result of torch.nn.NLLLoss() with correct predicts?
I'm learning the usage of torch.nn.NLLLoss() and torch.nn.LogSoftmax(), and I'm confused about their results.
For example:
import torch
import torch.nn as nn

lsm = torch.nn.LogSoftmax(dim=-1)
nll = nn.NLLLoss()
grnd_truth = torch.tensor([1])
# let's say it predicted…
EvilRoach
- 101
- 2
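A small sketch of the likely explanation: NLLLoss returns -log p(true class), not an accuracy, so a prediction can pick the correct class yet still produce a large loss if its confidence is low (values below are approximate):

import torch
import torch.nn as nn

lsm = nn.LogSoftmax(dim=-1)
nll = nn.NLLLoss()
grnd_truth = torch.tensor([1])

confident = torch.tensor([[0.0, 5.0]])      # correct and confident
hesitant = torch.tensor([[0.0, 0.2]])       # correct but barely
print(nll(lsm(confident), grnd_truth))      # ~0.0067
print(nll(lsm(hesitant), grnd_truth))       # ~0.598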