Let's say I have a dataset recording whether each attempt at pregnancy succeeded. The true success-to-failure ratio is 30:70, but the dataset I have is 70:30. How would that be an issue when modelling?
I understand that most modern algorithms can handle imbalanced data well. But if the class ratio in my data misrepresents the true ratio, will that affect the model's predictions?
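To make the concern concrete, here is a minimal sketch of what I think could go wrong. It assumes the only difference between my sample and the real population is the class ratio (the features within each class are distributed the same way), and that the model's predicted probabilities reflect the 70:30 ratio it was trained on. The function name `adjust_for_prior` and the example probabilities are made up for illustration. Under those assumptions, Bayes' rule says the predicted odds are off by the ratio of the true prior odds to the training prior odds:

```python
def adjust_for_prior(p_model, train_pos_rate, true_pos_rate):
    """Rescale a predicted success probability from the training class
    ratio to the true class ratio (hypothetical helper, illustration only).

    Assumes 0 < p_model < 1 and that only the class ratio differs
    between the training sample and the real population.
    """
    # Work in odds: swap the training prior odds for the true prior odds.
    model_odds = p_model / (1.0 - p_model)
    train_odds = train_pos_rate / (1.0 - train_pos_rate)
    true_odds = true_pos_rate / (1.0 - true_pos_rate)
    adjusted_odds = model_odds * true_odds / train_odds
    # Convert the adjusted odds back to a probability.
    return adjusted_odds / (1.0 + adjusted_odds)


if __name__ == "__main__":
    # Made-up model outputs, trained on 70:30 data, true ratio 30:70.
    for p in (0.3, 0.5, 0.7, 0.9):
        print(p, round(adjust_for_prior(p, 0.7, 0.3), 3))
```

If this reasoning is right, a prediction of 0.7 from a model trained on my data would correspond to only about 0.3 in the real population, so the raw probabilities would be too optimistic even if the ranking of cases is unaffected. I'm not sure whether this is the whole story, which is why I'm asking.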
I've tried looking for papers and articles, but none of them seem to answer my question. Links to papers and websites would be appreciated!