Statistics - Resampling through Random Percentage Split

1 - About

Percentage split (also called fixed split or holdout) is a resampling method that leaves out a random N% of the original data.

For example, you might select:

The algorithm is trained on the training data, and its accuracy is calculated on the test data set.
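As a minimal sketch of the idea, the Python snippet below holds out a random percentage of a toy data set and scores a trivial classifier on it. The function names (`percentage_split`, `majority_class`), the 30% figure, the toy data, and the majority-class "algorithm" are all assumptions made for illustration; they do not come from any particular tool.

```python
import random

def percentage_split(data, test_percent, seed=None):
    """Randomly hold out test_percent% of data; return (train, test)."""
    rng = random.Random(seed)
    shuffled = list(data)          # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_percent / 100)
    return shuffled[n_test:], shuffled[:n_test]

def majority_class(labels):
    """Stand-in 'algorithm': always predict the most frequent training label."""
    return max(set(labels), key=labels.count)

# Hypothetical toy data set: 30 (id, label) pairs, 20 "yes" and 10 "no".
data = [(i, "yes" if i % 3 else "no") for i in range(30)]

# Hold out 30% for testing, train on the remaining 70%.
train, test = percentage_split(data, test_percent=30, seed=0)

# Train on the training data only, then measure accuracy on the test data.
prediction = majority_class([label for _, label in train])
accuracy = sum(label == prediction for _, label in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2f}")
```

Fixing the seed makes the split reproducible; changing it gives a different random holdout and, usually, a slightly different accuracy.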

3 - Standard Deviation in Validation

When a random percentage split is repeated for validation, there is a good chance that the test sets of the different runs overlap, and an instance tested in one run may have been in the training set of another, so the algorithm has already seen it. With cross-validation, the test sets are disjoint, so this overlap doesn't occur. This is why the standard deviation estimate tends to be smaller for percentage split than for cross-validation.
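The overlap itself is easy to check. The sketch below, using hypothetical helper names and arbitrary sizes (100 instances, 10% test sets, 10 repetitions or folds), counts how many instances end up in more than one test set under each scheme. It only illustrates the overlap; it does not measure the standard deviation.

```python
import random

def repeated_split_test_sets(n, test_percent, repeats, seed=0):
    """Test-set indices for `repeats` independent random percentage splits."""
    rng = random.Random(seed)
    n_test = round(n * test_percent / 100)
    return [set(rng.sample(range(n), n_test)) for _ in range(repeats)]

def cross_validation_test_sets(n, folds, seed=0):
    """Test-set indices for k-fold cross-validation (a shuffled partition)."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    return [set(indices[i::folds]) for i in range(folds)]

def count_shared(test_sets):
    """Number of instances that appear in more than one test set."""
    seen, shared = set(), set()
    for test_set in test_sets:
        shared |= seen & test_set
        seen |= test_set
    return len(shared)

holdout = repeated_split_test_sets(n=100, test_percent=10, repeats=10)
cv = cross_validation_test_sets(n=100, folds=10)

print("instances tested more than once, repeated split:", count_shared(holdout))
print("instances tested more than once, cross-validation:", count_shared(cv))
```

Because the cross-validation folds partition the data, every instance is tested exactly once, whereas independent random splits draw each test set from the full data set and can pick the same instance several times.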

data_mining/validation_set.txt · Last modified: 2015/07/03 22:51 by gerardnico