All I can find online is how a random forest is run on all variables in a dataset using the period:

RF <- randomForest(sale ~ ., data = TrainSet, importance = TRUE)

What if I only want to apply the RF to selected variables in the dataset? Would I need to drop the variables first?

I tried the following, but get an error: RF <- randomForest(sale ~ v1,v2,v3, data = TrainSet, importance = TRUE)

kangaroo_cliff
ColRow

1 Answer

No need to drop anything first. Join the predictors with + instead of commas:

RF <- randomForest(sale ~ v1 + v2 + v3, data = TrainSet, importance = TRUE)

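This is R's standard formula notation, where + adds a predictor to the model rather than summing values. The notation comes from linear models, so it can look odd for a random forest, but it is how the predictors are specified.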
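If the list of predictors is long or stored in a vector, the formula can also be built programmatically. The following is a minimal sketch, not the only way to do it: the toy TrainSet, the extra column v4, and the variable names predictors and f are made up for illustration. reformulate() is from base R (the stats package), and the model fit only runs if the randomForest package is installed.

```r
# Hypothetical predictor names; replace with columns from your own data.
predictors <- c("v1", "v2", "v3")

# reformulate() builds the formula sale ~ v1 + v2 + v3 from the vector.
f <- reformulate(predictors, response = "sale")
print(f)

# Toy stand-in for TrainSet (assumed here only so the sketch runs).
set.seed(1)
TrainSet <- data.frame(sale = rnorm(50), v1 = rnorm(50), v2 = rnorm(50),
                       v3 = rnorm(50), v4 = rnorm(50))

if (requireNamespace("randomForest", quietly = TRUE)) {
  # Option 1: pass the constructed formula; v4 is ignored.
  RF <- randomForest::randomForest(f, data = TrainSet, importance = TRUE)

  # Option 2: skip the formula and pass the selected columns directly.
  RF2 <- randomForest::randomForest(x = TrainSet[, predictors],
                                    y = TrainSet$sale, importance = TRUE)

  print(randomForest::importance(RF))
}
```

Subsetting the data and keeping the period, as in sale ~ . with data = TrainSet[, c("sale", predictors)], should give an equivalent set of predictors.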

David Masip