Questions tagged [bagging]

20 questions
6
votes
1 answer

Boosting with highly correlated features

I have a conceptual question. My understanding is that Random Forest can be applied even when features are (highly) correlated. This is because with bagging, the influence of a few highly correlated features is moderated, since each feature only…
Peter
  • 7,277
  • 5
  • 18
  • 47
5
votes
1 answer

Can Boosting and Bagging be applied to heterogeneous algorithms?

Stacking can be achieved with heterogeneous algorithms such as RF, SVM and KNN. However, can such heterogeneity be achieved in Bagging or Boosting? For example, in Boosting, instead of using RF in all the iterations, could we use different…
Ahmad Bilal
  • 177
  • 5
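A minimal sketch related to this question, assuming scikit-learn: `BaggingClassifier` wraps a single base estimator, so the usual way to combine heterogeneous models (RF, SVM, KNN) in one ensemble is `VotingClassifier`. The dataset and settings below are illustrative only.

```python
# Heterogeneous ensemble via soft voting; BaggingClassifier itself
# accepts only one base-estimator type.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="soft",  # average the predicted class probabilities
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```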
3
votes
0 answers

Difference between Bagging and Bootstrap aggregating

The bootstrap is due to Efron; Tibshirani wrote a book about it with Efron. The bootstrap process for estimating the standard error of a statistic s(x): B bootstrap samples are generated from the original data. Finally the standard deviation of the…
martin
  • 329
  • 3
  • 12
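The procedure this excerpt describes can be sketched in a few lines of numpy: draw B resamples with replacement, evaluate the statistic s(x) on each, and take the standard deviation of the B replicates as the standard-error estimate. The data, B, and the choice of the mean as s(x) are illustrative assumptions.

```python
# Bootstrap standard error of the sample mean.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=100)  # original data
B = 2000

# s(x) = mean, evaluated on B resamples drawn with replacement
replicates = np.array([
    np.mean(rng.choice(x, size=x.size, replace=True)) for _ in range(B)
])
se_boot = replicates.std(ddof=1)

# For the mean, the bootstrap SE should be close to the analytic s / sqrt(n).
print(se_boot, x.std(ddof=1) / np.sqrt(x.size))
```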
3
votes
1 answer

Bagging vs pasting in ensemble learning

This is a quote from "Hands-on machine learning with Scikit-Learn, Keras and TensorFlow" by Aurelien Geron: "Bootstrapping introduces a bit more diversity in the subsets that each predictor is trained on, so bagging ends up with a slightly higher…
chekhovana
  • 31
  • 2
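In scikit-learn terms, the bagging-vs-pasting distinction Geron draws is just the `bootstrap` flag on `BaggingClassifier`: `True` samples with replacement (bagging), `False` without (pasting). A minimal sketch, with an illustrative dataset and ensemble sizes:

```python
# Same ensemble, two sampling schemes.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                            max_samples=0.8, bootstrap=True, random_state=0)
pasting = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                            max_samples=0.8, bootstrap=False, random_state=0)
bagging.fit(X, y)
pasting.fit(X, y)
```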
2
votes
1 answer

bagging vs. pasting in ensemble learning

I am a bit confused about two concepts. From my understanding, bagging is when each data point is replaced after each draw. So, for example, for each subset of data you pick one from the population, replace it, then pick one again, etc., and this is repeated for…
haneulkim
  • 385
  • 2
  • 11
2
votes
1 answer

Counting the number of trainable parameters in a gradient boosted tree

I recently ran the gradient boosted tree regressor using scikit-learn via: GradientBoostingRegressor(). This model depends on the following hyperparameters: Estimators ($N_1$), Min Samples Leaf ($N_2$), Max Depth ($N_3$), which in turn determine the…
ABIM
  • 123
  • 4
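One way to approach a count like this, sketched under illustrative settings: a fitted `GradientBoostingRegressor` exposes each tree's node count, so the learned split thresholds and leaf values can be tallied over `estimators_` and bounded by the hyperparameters above.

```python
# Count nodes across a fitted gradient-boosted ensemble.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, random_state=0)
gbr = GradientBoostingRegressor(n_estimators=50, max_depth=3,
                                min_samples_leaf=2, random_state=0).fit(X, y)

# estimators_ has shape (n_estimators, 1) for regression.
total_nodes = sum(tree.tree_.node_count for tree in gbr.estimators_.ravel())
print(total_nodes)  # bounded above by n_estimators * (2**(max_depth + 1) - 1)
```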
2
votes
1 answer

Can I do bagging method as improvement technique to decision tree in research?

Bagging uses a decision tree as the base classifier. I want to use bagging with a decision tree (C4.5) as the base, as a method that improves the decision tree (C4.5) and addresses overfitting in my research. Is that possible? Some lecturers said it is not right…
2
votes
1 answer

How does bagging help reduce the variance

I learned that bagging helps reduce variance by averaging, but I couldn't understand this. Can someone explain it intuitively?
Bhuwan Bhatt
  • 121
  • 3
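The intuition behind this question can be checked numerically: for n independent estimators each with variance sigma^2, their average has variance sigma^2 / n. The simulation below uses idealized independent "predictions" (real bagged trees are correlated, so the reduction is smaller); the numbers are illustrative.

```python
# Averaging n independent estimates shrinks the variance by a factor of n.
import numpy as np

rng = np.random.default_rng(0)
sigma, n_models, n_trials = 1.0, 25, 100_000

# Each row: predictions of n_models independent "models" for one trial.
preds = rng.normal(loc=0.0, scale=sigma, size=(n_trials, n_models))

single_var = preds[:, 0].var()           # one model: ~ sigma^2 = 1.0
averaged_var = preds.mean(axis=1).var()  # the average: ~ sigma^2 / n = 0.04

print(single_var, averaged_var)
```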
2
votes
1 answer

Random Forest Stacking Experiment for Imbalanced Data-set Problem

In order to solve an imbalanced-dataset problem, I experimented with Random Forest in the following manner (somewhat inspired by deep learning): trained a Random Forest which takes in the input data and predicts the probability of the label of the…
Aman Raparia
  • 257
  • 2
  • 8
1
vote
1 answer

Why can't we sample without replacement for each tree in a random forest if the subsample size is large enough?

Usually if we have $n$ observations, for each tree we form a bootstrapped subsample of size $n$ with replacement. On googling it, one common explanation I've seen is that sampling with replacement is necessary for the independence of individual…
user9343456
  • 157
  • 8
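A quick simulation that bears on this question: a size-n bootstrap sample drawn with replacement contains on average about 1 - 1/e ≈ 63.2% distinct observations, so each tree still sees a different subset even though the nominal sample size equals n. The values of n and the trial count below are illustrative.

```python
# Fraction of distinct observations in a with-replacement sample of size n.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 200

fractions = [
    np.unique(rng.integers(0, n, size=n)).size / n for _ in range(trials)
]
print(np.mean(fractions))  # close to 1 - 1/e
```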
1
vote
1 answer

Base model in ensemble learning

I've been doing some research on ensemble learning and read that for base models, models with high variance are often recommended (I can't remember which book I read this in exactly). But it seems counter-intuitive, because wouldn't having base…
haneulkim
  • 385
  • 2
  • 11
1
vote
1 answer

Why the accuracy of my bagging model heavily affected by random state?

The accuracy of my bagging decision tree model reaches 97% when I set the random seed to 5, but drops to only 92% when I set the random seed to 0. Can someone explain the huge gap, and should I just use the accuracy with the highest value in…
1
vote
0 answers

Can bagging ensemble consist of heterogeneous base models?

Bagging or bootstrap aggregation seems to make sense for time series forecasting using an ensemble because bagging randomizes subsets of the data with replacement. However, I've only seen bagging used for homogeneous base learners when constructing…
develarist
  • 214
  • 1
  • 10
1
vote
1 answer

Can the product of tree regressions be represented by a single tree?

Assume that we have two separate tree regressions. I'm interested in understanding whether the product of tree regressions can be represented by a single tree. Would this be possible?
1
vote
1 answer

What are valid measures for reporting k-fold score in the case of confusion-matrix?

I know that when a model is made to predict a float value, a common approach to reporting the model's validation is using the k-fold technique and calculating the average accuracy across all folds (here is a similar question). Now suppose that my model is a classifier…
morteza
  • 11
  • 2
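One common answer to questions like this, sketched with an illustrative dataset and model: rather than averaging per-fold confusion matrices, collect the out-of-fold predictions with `cross_val_predict` and build a single confusion matrix, from which per-class metrics follow.

```python
# One confusion matrix over all out-of-fold predictions.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0)

y_pred = cross_val_predict(model, X, y, cv=5)  # each sample predicted exactly once
cm = confusion_matrix(y, y_pred)
print(cm)  # 3x3 matrix over all 150 out-of-fold predictions
```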