Data Mining - (two class|binary) classification problem (yes/no, false/true)


About

Binary classification is used to predict one of two possible outcomes.

A two-class problem (binary problem) has exactly two possible outcomes, for instance:

  • “yes” or “no”
  • “success” or “failure”

A single experiment with such an outcome is better known as a Bernoulli trial (or binomial trial).
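
As a brief formal sketch of a Bernoulli trial (the symbols X and p are notation assumed here, not taken from this page): let X be 1 for “success” and 0 for “failure”, and let p be the probability of success. Then:

```latex
P(X = 1) = p, \qquad P(X = 0) = 1 - p, \qquad 0 \le p \le 1
% Both cases in a single formula, for x \in \{0, 1\}:
P(X = x) = p^{x} (1 - p)^{1 - x}
```

Under this notation, a binary classifier can be read as a function that estimates p for a given case.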


Example

  • Is this transaction a fraud?
  • Will this prospect become a customer?
  • Will this employee leave the company in the next year?
  • Is the top card of a shuffled deck an ace?
  • Was the newborn child a girl?
  • Rolling a die, where a six is “success” and everything else a “failure”.
  • In conducting a political opinion poll, choosing a voter at random to ascertain whether that voter will vote “yes” in an upcoming referendum.
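
Two of the examples above (the die and the deck of cards) have a known success probability, so they can be simulated. The following is a minimal sketch, assuming a fair six-sided die and a standard 52-card deck; the names `bernoulli_trial`, `estimate_success_rate`, `P_SIX` and `P_ACE` are hypothetical helpers introduced only for this illustration.

```python
import random
from fractions import Fraction

# Exact success probabilities (assumptions: fair die, standard 52-card deck).
P_SIX = Fraction(1, 6)   # rolling a six
P_ACE = Fraction(4, 52)  # top card is one of the 4 aces


def bernoulli_trial(p, rng):
    """Run one yes/no experiment: True (success) with probability p."""
    return rng.random() < float(p)


def estimate_success_rate(p, n_trials, seed=0):
    """Repeat the trial n_trials times and return the observed success rate."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    successes = sum(bernoulli_trial(p, rng) for _ in range(n_trials))
    return successes / n_trials


if __name__ == "__main__":
    for name, p in [("six on a die", P_SIX), ("ace on top", P_ACE)]:
        rate = estimate_success_rate(p, n_trials=100_000)
        print(f"{name}: exact p = {float(p):.4f}, observed = {rate:.4f}")
```

With many repetitions the observed rate should settle close to the exact probability. In the other examples (fraud, churn, the referendum) the probability is unknown and has to be estimated from data, which is the job of a binary classifier.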




