I want to build a model that predicts the propensity to buy a certain product. Because the proportion of 1's in my data is very low, I decided to apply oversampling (to get 10% 1's and 90% 0's).
Now I want to discretize some of the variables. To do so, I fit a decision tree for each variable against the target and use its split points as bin boundaries.
Should I define the prior probabilities when I fit these trees, or does it not matter, so that I can use the oversampled dataset as is?
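
To make the question concrete, here is a minimal sketch of what I mean by "defining the priors" for a single variable. It is plain Python rather than my actual tool. The function names (`prior_weights`, `best_split`), the toy data, and the 1% original event rate are all hypothetical and only for illustration. The sketch turns the priors into per-class weights and then looks for one weighted-Gini cut point, which is roughly what a one-level tree would do.

```python
def prior_weights(sample_rate, true_rate):
    """Per-class weights that rescale an oversampled set back to the true priors.

    sample_rate: proportion of 1's after oversampling (0.10 in my case).
    true_rate:   proportion of 1's in the original population (assumed value).
    """
    return {
        1: true_rate / sample_rate,
        0: (1 - true_rate) / (1 - sample_rate),
    }


def weighted_gini(w_pos, w_neg):
    """Gini impurity of a node, using summed class weights instead of counts."""
    total = w_pos + w_neg
    if total == 0:
        return 0.0
    p = w_pos / total
    return 2 * p * (1 - p)


def best_split(x, y, weights):
    """Return the cut point on x that minimizes the weighted Gini impurity.

    This is a one-level tree on a single variable; a real binning tree
    would recurse on each side to get more than two bins.
    """
    pairs = sorted(zip(x, y))
    tot_pos = sum(weights[1] for _, t in pairs if t == 1)
    tot_neg = sum(weights[0] for _, t in pairs if t == 0)

    left_pos = left_neg = 0.0
    best_cut, best_score = None, float("inf")

    for i in range(len(pairs) - 1):
        xi, ti = pairs[i]
        if ti == 1:
            left_pos += weights[1]
        else:
            left_neg += weights[0]

        x_next = pairs[i + 1][0]
        if xi == x_next:
            continue  # cannot cut between identical values

        right_pos = tot_pos - left_pos
        right_neg = tot_neg - left_neg
        n_left = left_pos + left_neg
        n_right = right_pos + right_neg
        score = (
            n_left * weighted_gini(left_pos, left_neg)
            + n_right * weighted_gini(right_pos, right_neg)
        ) / (n_left + n_right)

        if score < best_score:
            best_score, best_cut = score, (xi + x_next) / 2

    return best_cut


# Hypothetical toy data: one variable and a binary target.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]

unweighted = {0: 1.0, 1: 1.0}                              # oversampled data as is
reweighted = prior_weights(sample_rate=0.10, true_rate=0.01)  # assumed 1% true rate

cut_as_is = best_split(x, y, unweighted)
cut_with_priors = best_split(x, y, reweighted)
```

On cleanly separated data like this toy example both calls give the same cut point. What I am unsure about is whether, on real and noisy data, the weighted and unweighted versions can produce different bins, and if so which one I should use.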