I have a dataset with 5K records for a binary classification problem.
My features are min_blood_pressure, max_blood_pressure, min_heart_rate, max_heart_rate, etc. In total I have more than 15 measurements, and each of them has a min and a max column, so there are 30+ variables.
When I ran a correlation analysis on the data, I saw that these input features are highly correlated with each other. For example, min_blood_pressure is highly correlated (correlation > 0.8) with max_blood_pressure, and the same holds for the min and max columns of every other measurement. Their individual correlations with the target variable, however, are low.
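To make the check concrete, here is a minimal sketch of how such pairs can be listed with pandas. The data below is synthetic, and the helper name `high_corr_pairs` and the 0.8 cutoff are illustrative assumptions, not my actual dataset or code.

```python
import numpy as np
import pandas as pd


def high_corr_pairs(df, threshold=0.8):
    """Return (col_a, col_b, |corr|) for every column pair above the threshold."""
    corr = df.corr().abs()  # absolute Pearson correlation matrix
    cols = list(corr.columns)
    pairs = []
    # Walk the upper triangle only, so each pair is reported once.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            value = float(corr.iloc[i, j])
            if value > threshold:
                pairs.append((cols[i], cols[j], value))
    return pairs


# Synthetic stand-in for the real data (assumption): each max column is its
# min column plus noise, which makes the two strongly correlated.
rng = np.random.default_rng(0)
n = 5000
min_bp = rng.normal(70, 10, n)
min_hr = rng.normal(60, 8, n)
df = pd.DataFrame({
    "min_blood_pressure": min_bp,
    "max_blood_pressure": min_bp + rng.normal(40, 5, n),
    "min_heart_rate": min_hr,
    "max_heart_rate": min_hr + rng.normal(30, 4, n),
})

pairs = high_corr_pairs(df, threshold=0.8)
for a, b, value in pairs:
    print(f"{a} vs {b}: {value:.2f}")
```

On data like this, only the min/max pair of each measurement should be reported; pairs across different measurements stay below the cutoff.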
So in this case, which one should I drop or how should I handle this scenario?
I guess there are min and max variables for a reason. What would you do in a situation like this?
Should we take the average of the min and max for each measurement and use that as a new feature?
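To show what I mean by averaging, here is a rough sketch, assuming the columns follow the `min_<name>` / `max_<name>` pattern. The helper `add_mid_and_range` and the toy values are hypothetical. The extra range column (max minus min) is my own addition, on the assumption that the spread may carry information that the average alone would lose.

```python
import pandas as pd


def add_mid_and_range(df, measurements):
    """Replace each min_/max_ pair with a midpoint and a range column."""
    out = df.copy()
    for name in measurements:
        lo, hi = f"min_{name}", f"max_{name}"
        out[f"mid_{name}"] = (out[lo] + out[hi]) / 2  # average of min and max
        out[f"range_{name}"] = out[hi] - out[lo]      # spread between them
        out = out.drop(columns=[lo, hi])              # drop the correlated originals
    return out


# Toy values for illustration only (assumption, not real records).
toy = pd.DataFrame({
    "min_blood_pressure": [70, 80],
    "max_blood_pressure": [110, 130],
    "min_heart_rate": [60, 55],
    "max_heart_rate": [90, 100],
})

features = add_mid_and_range(toy, ["blood_pressure", "heart_rate"])
print(features)
```

This keeps two columns per measurement instead of collapsing to one, so no information is lost compared with the original min and max.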
Can anyone help me with this?