Statistics - Sample (Variable | Attribute | Feature)

About

A (statistician|data miner) studying a population collects information about different characteristics of each subject in a sample (such as length, weight or age). Those characteristics are called variables.

In a database, they are just columns.

A variable can take on multiple values. In contrast, a constant has only one value.
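As a minimal sketch of this distinction, assume a small sample stored as a list of records. The column names and values below are invented for illustration. A column behaves as a variable in the sample when it takes more than one distinct value, and as a constant when it takes exactly one:

```python
# Hypothetical sample: each dict is one subject, each key is a column.
sample = [
    {"species": "tortoise", "weight_kg": 4.2, "age": 12},
    {"species": "tortoise", "weight_kg": 5.1, "age": 30},
    {"species": "tortoise", "weight_kg": 3.8, "age": 12},
]


def distinct_values(records, column):
    """Return the set of distinct values observed in one column."""
    return {record[column] for record in records}


def is_constant(records, column):
    """A column is constant in this sample if it takes exactly one value."""
    return len(distinct_values(records, column)) == 1


# Report, for each column, whether it varies within the sample.
for column in sample[0]:
    kind = "constant" if is_constant(sample, column) else "variable"
    print(column, kind)
```

Note that a column that is constant in one sample may still vary in the population, so this check only describes the data at hand.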

A variable has:

a data_type
and a usage

In mathematics, variables are listed among the arguments that a function takes.

Data Type

Variables (of an instance) are of two types:

discrete (called nominal, categorical or qualitative)
or continuous (called numeric, numerical or quantitative).

They also have four levels of measurement (nominal, ordinal, interval and ratio).

Categorical

When a characteristic can be neatly placed into well-defined groups, or categories that do not depend on order, it is called a categorical variable (some statisticians use the word qualitative).

Numerical

When a characteristic is a count or a measurement, such as the total number of each species of tortoise or how many individuals there are per square kilometre, it is called a numerical (or quantitative) variable.

See Number - Collection
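The split between categorical and numerical variables can be sketched with a simple heuristic. This is only an illustration under an assumption: the stored Python type is taken to reflect the nature of the variable, which does not always hold (a postal code stored as a number is still categorical). The function name and the sample values are hypothetical.

```python
from numbers import Number


def variable_type(values):
    """Classify a column as 'numerical' or 'categorical' (simplified heuristic)."""
    # bool is a subclass of int in Python, so it is excluded explicitly:
    # a true/false flag is treated as a two-category variable here.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "numerical"
    return "categorical"


# Hypothetical columns from a tortoise survey.
species = ["leopard", "radiated", "leopard"]
individuals_per_km2 = [12.5, 3.0, 7.25]

print(variable_type(species))
print(variable_type(individuals_per_km2))
```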

Usage

Type of attribute	Type of model	Description
independent variable (predictors|feature)	supervised	Predictors that affect a given outcome
dependent variable (outcome,…)	supervised	Outcome that is affected by the predictors
descriptors	(unsupervised|descriptive)	Items of information analysed for natural groupings or associations
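For a supervised model, the usage of each variable can be sketched as a split of every record into its predictors and its outcome. The helper below is a hypothetical illustration and not the API of any particular library. The column names, including the choice of "species" as the outcome, are assumptions made for the example.

```python
# Hypothetical records: one dict per case.
records = [
    {"case_id": 1, "age": 12, "weight_kg": 4.2, "species": "leopard"},
    {"case_id": 2, "age": 30, "weight_kg": 5.1, "species": "radiated"},
    {"case_id": 3, "age": 7, "weight_kg": 3.8, "species": "leopard"},
]


def split_roles(records, outcome, ignore=()):
    """Separate independent variables (predictors) from the dependent one (outcome).

    Columns listed in `ignore` (for example an identifier) are given no role.
    """
    predictors = []
    outcomes = []
    for record in records:
        predictors.append(
            {k: v for k, v in record.items() if k != outcome and k not in ignore}
        )
        outcomes.append(record[outcome])
    return predictors, outcomes


X, y = split_roles(records, outcome="species", ignore=("case_id",))
print(X[0])
print(y)
```

In an unsupervised setting there is no outcome column, so all remaining columns would simply act as descriptors.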

Variable Name Glossary

Dependent variable	Independent variable
Dependent	Independent
Outcome	Predictor
…

Others

Missing Value

See Data Mining - (Missing Value|Not Available) NA

Case id

A case id uniquely identifies each record, which helps with model repeatability.
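A case id is only useful if it is actually unique. As a minimal sketch, assuming records are dicts with a hypothetical "case_id" column, the following check lists any identifier that appears more than once:

```python
from collections import Counter


def duplicate_case_ids(records, id_column="case_id"):
    """Return the sorted list of case ids that occur more than once."""
    counts = Counter(record[id_column] for record in records)
    return sorted(case_id for case_id, n in counts.items() if n > 1)


# Hypothetical records: case id 1 appears twice.
cases = [
    {"case_id": 1, "age": 12},
    {"case_id": 2, "age": 30},
    {"case_id": 1, "age": 7},
]

print(duplicate_case_ids(cases))
```

An empty result means every record can be referenced unambiguously, for example when sorting the cases into a fixed order before building a model.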