Machine learning - Bootstrap aggregating (bagging)


About

Bootstrap aggregating (bagging) is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.

It also reduces variance and helps to avoid over-fitting. Although it is usually applied to decision tree methods, it can be used with any type of method.

Bagging:

  1. produces several different training sets of the same size by sampling with replacement,
  2. then builds a model for each training set using the same machine learning scheme,
  3. and combines the predictions by voting for a nominal target or by averaging for a numeric target (see the sketch below).

Bagging can be parallelized, since each model is built independently of the others.
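To make the three steps concrete, here is a minimal from-scratch sketch in Python. The synthetic dataset, the ensemble size, and the use of scikit-learn decision trees as the base learner are illustrative assumptions, not part of any particular library's bagging implementation.

  import numpy as np
  from sklearn.datasets import make_classification
  from sklearn.tree import DecisionTreeClassifier

  # Illustrative synthetic data and ensemble size (arbitrary assumptions).
  X, y = make_classification(n_samples=500, random_state=0)
  rng = np.random.default_rng(0)
  n_models = 25
  models = []

  # Steps 1 and 2: draw a bootstrap sample of the same size as the training
  # set (with replacement) and build one model per sample with the same scheme.
  for _ in range(n_models):
      idx = rng.integers(0, len(X), size=len(X))  # sampling with replacement
      models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

  # Step 3: combine predictions by voting (nominal target).
  votes = np.stack([m.predict(X) for m in models])  # shape: (n_models, n_samples)
  y_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

  # For a numeric target, use regressors and average their predictions instead:
  # y_pred = np.mean([m.predict(X) for m in models], axis=0)

Since each iteration of the loop is independent, the model-building step is the part that can run in parallel.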

Advantages / Disadvantages

Bagging is well suited to “unstable” learning schemes, i.e. schemes where a small change in the training data can produce a big change in the model.

Example: decision trees are a very unstable scheme, whereas Naïve Bayes and instance-based learning are not, because in those schemes all attributes contribute independently.
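As a quick, hedged illustration of this instability, the sketch below compares the spread of cross-validation scores of a single decision tree with that of a bagged ensemble of trees (scikit-learn's BaggingClassifier bags decision trees by default); the synthetic dataset and the numbers of folds and estimators are arbitrary choices.

  import numpy as np
  from sklearn.datasets import make_classification
  from sklearn.ensemble import BaggingClassifier  # bags decision trees by default
  from sklearn.model_selection import cross_val_score
  from sklearn.tree import DecisionTreeClassifier

  X, y = make_classification(n_samples=500, random_state=0)

  # Spread of scores of the unstable base learner vs. its bagged version.
  tree_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
  bag_scores = cross_val_score(BaggingClassifier(n_estimators=25, random_state=0), X, y, cv=10)

  print(f"single tree : mean={tree_scores.mean():.3f} std={tree_scores.std():.3f}")
  print(f"bagged trees: mean={bag_scores.mean():.3f} std={bag_scores.std():.3f}")

On an unstable learner the bagged scores typically show a smaller standard deviation; for a stable scheme such as Naïve Bayes the difference would be small.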

Replacement

In bagging, you sample the training set “with replacement”, which means that the same instance may appear more than once in a sample.
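A tiny sketch of what sampling with replacement means in practice (the data values and seed are arbitrary):

  import numpy as np

  rng = np.random.default_rng(42)
  data = np.array([10, 20, 30, 40, 50])

  # A bootstrap sample has the same size as the data, drawn with replacement:
  # the same value can be drawn more than once, and some values may be missing.
  sample = rng.choice(data, size=len(data), replace=True)
  print(sample)

On average, a bootstrap sample contains about 63.2% (1 − 1/e) of the distinct original instances; the remaining slots are filled by duplicates.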

Implementation

Weka

meta > Bagging (class weka.classifiers.meta.Bagging)
