When using pre-trained models, it is common practice to normalize the input images with the ImageNet statistics:
mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225].
How are these parameters derived?
According to PyTorch's docs, the mean and std can be calculated along these lines:
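import torch
from torchvision import datasets, transforms as T

# Resize and crop as the pre-trained models expect; ToTensor scales pixels to [0, 1].
transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)

means = []
stds = []
# subset(...) is a placeholder, not a torchvision function: it stands for some sample of the training set.
for img, _ in subset(dataset):  # each dataset item is an (image, label) pair
    means.append(img.mean(dim=(1, 2)))  # per-channel mean over height and width
    stds.append(img.std(dim=(1, 2)))    # per-channel std over height and width

# Average the per-image statistics to get one value per RGB channel.
mean = torch.stack(means).mean(dim=0)
std = torch.stack(stds).mean(dim=0)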
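
To make the arithmetic concrete without downloading ImageNet, here is a minimal dependency-free sketch of the same idea: take the mean and std of each channel per image, then average those across images. The toy pixel values and the helper names `channel_stats` and `normalize_pixel` are made up for illustration; they are not part of torchvision. The last line applies the usual (value - mean) / std normalization to a single white pixel using the ImageNet values from the question.

```python
from statistics import fmean, stdev

# Hypothetical toy data: two "images", each with three channels (R, G, B) of
# four pixel values already scaled to [0, 1]. These are not real ImageNet pixels.
images = [
    [[0.2, 0.4, 0.6, 0.8], [0.1, 0.1, 0.3, 0.3], [0.5, 0.5, 0.5, 0.5]],
    [[0.0, 0.2, 0.4, 0.6], [0.3, 0.3, 0.5, 0.5], [0.1, 0.3, 0.5, 0.7]],
]

def channel_stats(images):
    """Average the per-image, per-channel mean and sample std across all images."""
    n_channels = len(images[0])
    mean = [fmean(fmean(img[c]) for img in images) for c in range(n_channels)]
    std = [fmean(stdev(img[c]) for img in images) for c in range(n_channels)]
    return mean, std

def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of a single pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]

mean, std = channel_stats(images)
print("per-channel mean:", [round(m, 3) for m in mean])
print("per-channel std: ", [round(s, 3) for s in std])

# Normalizing a white pixel with the ImageNet statistics from the question.
white = normalize_pixel([1.0, 1.0, 1.0], [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print("normalized white pixel:", [round(v, 3) for v in white])
```

Averaging per-image standard deviations, as done here and in the snippet above, is not the same as taking one standard deviation over all pixels of all images, so the two approaches can give slightly different numbers.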