Data Mining - Random forest
1 - About
Random forests are collections of trees, all slightly different.
A random forest randomizes the learning algorithm itself rather than relying only on randomized training data. How you randomize depends on the algorithm. For C4.5: don't always pick the best split option, pick randomly from the k best options.
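The "pick randomly from the k best options" idea can be sketched as follows. This is a minimal illustration, not C4.5 itself: it ranks nominal attributes by plain information gain (C4.5 uses gain ratio), and the names `entropy`, `information_gain`, `pick_split_attribute` and the toy weather data are assumptions made up for this example.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one nominal attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def pick_split_attribute(rows, labels, attributes, k, rng):
    """Rank attributes by gain, then choose one of the k best at random."""
    ranked = sorted(attributes, key=lambda a: information_gain(rows, labels, a), reverse=True)
    return rng.choice(ranked[:k])

# Hypothetical toy data: "outlook" separates the classes perfectly.
rows = [
    {"outlook": "sunny", "windy": "no", "humid": "high"},
    {"outlook": "sunny", "windy": "yes", "humid": "high"},
    {"outlook": "rainy", "windy": "no", "humid": "high"},
    {"outlook": "rainy", "windy": "yes", "humid": "low"},
]
labels = ["no", "no", "yes", "yes"]
attributes = ["outlook", "windy", "humid"]

rng = random.Random(0)
best_only = pick_split_attribute(rows, labels, attributes, 1, rng)    # deterministic tree
one_of_top2 = pick_split_attribute(rows, labels, attributes, 2, rng)  # randomized tree
```

With k = 1 this reduces to the usual deterministic choice. A larger k makes each tree in the forest slightly different, even when all trees see the same data.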
It generally gives better predictions than a single decision tree.
A random forest is a meta estimator that fits a number of classical decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
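The "fit on sub-samples, then average" description can be sketched in a few lines. This is a simplified illustration under stated assumptions: each "tree" is only a one-level stump on a randomly chosen numeric feature, averaging is done as a majority vote over class labels, and every name here (`bootstrap_sample`, `fit_stump`, `fit_forest`, `predict_forest`) is hypothetical rather than taken from any library.

```python
import random
from collections import Counter

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples with replacement: one sub-sample per tree."""
    idx = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in idx], [labels[i] for i in idx]

def fit_stump(rows, labels, rng):
    """Stand-in for a decision tree: split one random feature at its mean."""
    feature = rng.randrange(len(rows[0]))
    threshold = sum(r[feature] for r in rows) / len(rows)
    left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
    right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
    # An empty side falls back to the majority label of the whole sample.
    return feature, threshold, majority(left or labels), majority(right or labels)

def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label

def fit_forest(rows, labels, n_trees, seed=0):
    """Fit n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(rows, labels, rng), rng) for _ in range(n_trees)]

def predict_forest(forest, row):
    """Combine the individual predictions by majority vote."""
    return majority([predict_stump(stump, row) for stump in forest])

# Hypothetical toy data: class "a" has small values, class "b" large ones.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
forest = fit_forest(rows, labels, n_trees=25)
```

A single stump trained on an unlucky sub-sample can be wrong, but the vote across many stumps is much more stable, which is the point of the averaging step.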
2 - Articles Related
3 - Interpretation
Diagnostic charts from random forests are much easier to understand than those from logistic regression.
4 - Implementation
4.1 - Weka
trees > RandomForest (options: number of trees (default 10), maximum depth of the trees, number of attributes considered at random for each split)