Let's assume I have 3 different climate models that project the temperature for a specific region. I also have the observations (the real temperatures) for that region over the same time frame. My readings are daily. Which metric should I use, and why? For example:

- Mean error (the plain average of the differences) is not really useful, because differences of opposite sign can cancel each other out.
- Mean Absolute Error (MAE). I think this is the most useful. However, I am puzzled: when I computed projections minus observations, I noticed that around 500 of the 13500 readings show a big difference (around 10 degrees Celsius). Should I include these outliers or just delete them?
- Do Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) give me any valuable insight?

EDIT: Hey everyone, thank you for the feedback. The problem at hand is that I am given 4 different bias correction methods and I want to create a multi-model with equal voting. I want to show which bias correction method is more suitable: quantile mapping or scaled distribution mapping (Gamma and Normal corrections).

To do that (and to show that bias correction is useful), I computed the MAE, MSE and RMSE to find the most accurate model. Then I performed bias correction with all methods and computed the same metrics again. I also created the multi-model (basically it is the average of the models). From these I made a plot with a single model, a multi-model, and a multi-model with bias correction.

With these metrics, the most accurate method was scaled distribution mapping with Normal corrections (it has the lowest MAE), which makes sense since temperature follows a normal distribution over the years. I based the comparison on MAE because I read that it is commonly used to compare models.

Or should I just plot the outcomes of each bias correction method and see which one follows the distribution better?
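
For reference, here is a minimal sketch of how the metrics above and a simple distribution check could be computed with NumPy. The function names (`error_metrics`, `quantile_gap`), the model labels, and the four-value series are all made up for illustration; they are not my real data. The toy values are chosen so that one model has errors that cancel in the mean error and another has a single 10-degree outlier.

```python
import numpy as np

def error_metrics(projection, observation):
    """Mean error (bias), MAE, MSE and RMSE for two equal-length series."""
    projection = np.asarray(projection, dtype=float)
    observation = np.asarray(observation, dtype=float)
    diff = projection - observation          # signed daily differences
    mse = np.mean(diff ** 2)
    return {
        "ME": float(np.mean(diff)),          # signs can cancel out
        "MAE": float(np.mean(np.abs(diff))), # every day weighted equally
        "MSE": float(mse),                   # large errors weighted more
        "RMSE": float(np.sqrt(mse)),         # same units as the data
    }

def quantile_gap(projection, observation, qs=(0.05, 0.5, 0.95)):
    """Difference between projected and observed quantiles (distribution check)."""
    return np.quantile(projection, qs) - np.quantile(observation, qs)

# Toy values, purely illustrative (hypothetical, not real readings).
obs = np.array([10.0, 12.0, 11.0, 13.0])
models = {
    "model_a": np.array([11.0, 11.0, 12.0, 12.0]),  # errors of +1 and -1
    "model_b": np.array([12.0, 14.0, 13.0, 15.0]),  # constant +2 bias
    "model_c": np.array([10.0, 12.0, 11.0, 23.0]),  # one 10-degree outlier
}
# Equal-weight multi-model: the plain average of the individual models.
models["multi_model"] = np.mean(list(models.values()), axis=0)

scores = {name: error_metrics(proj, obs) for name, proj in models.items()}
gaps = {name: quantile_gap(proj, obs) for name, proj in models.items()}

for name in models:
    print(name, scores[name], gaps[name])
```

In this toy case `model_a` has a mean error of 0 but an MAE of 1, which is the cancellation problem, while `model_b` and `model_c` have similar MAE (2 and 2.5) but very different RMSE (2 and 5), because squaring lets the single outlier dominate.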