R - NA (Not Available)

1 - About

This page describes how R represents and handles missing (Not Available) values, a common data-mining concern.

NA (Not Available, i.e. a missing value) is a logical constant.

See

?NA
> str(NA)
 logi NA

An NA has a type. The plain NA is logical, and there are typed variants for the other atomic types, such as integer (NA_integer_), double (NA_real_) and character (NA_character_).

NA is a logical constant of length 1 that contains a missing-value indicator.
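
The typed variants mentioned above can be checked with typeof. This is a minimal sketch; the variable names (na_types, in_numeric, in_character) are made up for illustration.

```r
# Plain NA is logical; the typed constants below are part of base R.
na_types <- c(
  plain     = typeof(NA),
  integer   = typeof(NA_integer_),
  double    = typeof(NA_real_),
  character = typeof(NA_character_)
)
print(na_types)

# Inside a vector, NA is coerced to the type of the other elements.
in_numeric   <- typeof(c(1.5, NA))   # "double"
in_character <- typeof(c("a", NA))   # "character"
```

In everyday code the plain NA is usually enough, because R coerces it to the type of the surrounding vector.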

3 - Function

3.1 - is.na

  • Is NA missing?
> is.na(NA)
[1] TRUE
  • Which elements of a vector are NA?
> v <- c(1, NA, 2)
> is.na(v)
[1] FALSE  TRUE FALSE
  • NaN (Not a Number) is also reported as missing:
> is.na(NaN)
[1] TRUE
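
Because is.na also returns TRUE for NaN, it can help to see how it differs from is.nan, and why NA cannot be tested with ==. The sketch below uses only base R functions; the variable names are illustrative.

```r
x <- c(1, NA, NaN)

missing_flags <- is.na(x)    # FALSE  TRUE  TRUE  (NA and NaN both count)
nan_flags     <- is.nan(x)   # FALSE FALSE  TRUE  (only NaN counts)

# Comparing with == does not detect NA: the result is itself NA.
comparison <- NA == NA       # NA

# anyNA() answers "is there at least one missing value?" in one call.
has_missing <- anyNA(x)      # TRUE
```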

4 - Management

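4.1 - Analytics

Proportion of non-missing values in each column of a data frame (here named myDataFrame, because data.frame is the name of a base R function):

vapply(myDataFrame, function(x) mean(!is.na(x)), numeric(1))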
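
As a rough illustration of the expression above, the sketch below applies it to a small, made-up data frame (the columns age and city are hypothetical) and also computes the complementary share of missing values with colMeans.

```r
# Hypothetical data frame, used for illustration only.
myDataFrame <- data.frame(
  age  = c(31, NA, 45, 28),
  city = c("Lyon", "Oslo", NA, NA)
)

# Share of non-missing values per column.
completeness <- vapply(myDataFrame, function(x) mean(!is.na(x)), numeric(1))
print(completeness)
#  age city
# 0.75 0.50

# Share of missing values per column (is.na on a data frame gives a logical matrix).
missingShare <- colMeans(is.na(myDataFrame))
```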

4.2 - Remove

4.2.1 - Vector

> v <- c(1, NA, 2)
> v[!is.na(v)]
[1] 1 2
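
When the goal is only to compute a summary, the values do not have to be removed first: many base R functions accept na.rm = TRUE. A minimal sketch with illustrative variable names:

```r
v <- c(1, NA, 2)

mean_with_na    <- mean(v)                 # NA, the missing value propagates
mean_without_na <- mean(v, na.rm = TRUE)   # 1.5
sum_without_na  <- sum(v, na.rm = TRUE)    # 3
```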

4.2.2 - Matrix

> colA <- c(1, 2, 3, 4, NA, 6)
> colB <- c(1, 2, NA, 4, 5, 6)
> m <- cbind(colA, colB)
> m
     colA colB
[1,]    1    1
[2,]    2    2
[3,]    3   NA
[4,]    4    4
[5,]   NA    5
[6,]    6    6
> rowWithoutNa <- complete.cases(m)
> rowWithoutNa
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
> m[rowWithoutNa, ]
     colA colB
[1,]    1    1
[2,]    2    2
[3,]    4    4
[4,]    6    6
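
The same selection can be sketched without complete.cases by counting the NA values in each row. This is an alternative shown for illustration; rowHasNa is a made-up name.

```r
colA <- c(1, 2, 3, 4, NA, 6)
colB <- c(1, 2, NA, 4, 5, 6)
m <- cbind(colA, colB)

# A row is incomplete when it holds at least one NA.
rowHasNa <- rowSums(is.na(m)) > 0

# Positions of the incomplete rows.
incompleteRows <- which(rowHasNa)   # 3 5

# Keep only the complete rows, same result as m[complete.cases(m), ].
cleaned <- m[!rowHasNa, ]
```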

4.3 - Eliminate the row

To drop every row of a data frame that contains a missing value, use the na.omit function:

myDataFrame <- na.omit(myDataFrame)
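
A small sketch of na.omit on a made-up data frame (the columns x and y are hypothetical). It also shows that the result remembers which rows were dropped, which can be read back with na.action.

```r
# Hypothetical data frame, used for illustration only.
myDataFrame <- data.frame(
  x = c(1, NA, 3),
  y = c("a", "b", NA)
)

cleaned <- na.omit(myDataFrame)
remainingRows <- nrow(cleaned)                    # 1, only the first row is complete

# Row positions that na.omit removed.
droppedRows <- as.integer(na.action(cleaned))     # 2 3
```

Note that na.omit drops a row as soon as any column is missing, so it can discard more data than expected on wide data frames.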

4.4 - Count

To count the missing values in one variable (column) of a data frame:

with(dataFrame, sum(is.na(myVariableName)))
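
The count works because sum treats TRUE as 1. The sketch below runs it on a made-up data frame (the column other is hypothetical) and extends the idea to every column at once with colSums.

```r
# Hypothetical data frame, used for illustration only.
dataFrame <- data.frame(
  myVariableName = c(10, NA, 30, NA),
  other          = c("a", NA, "c", "d")
)

# Missing values in a single variable.
naInVariable <- with(dataFrame, sum(is.na(myVariableName)))   # 2

# Missing values in every column.
naPerColumn <- colSums(is.na(dataFrame))
print(naPerColumn)
# myVariableName          other
#              2              1
```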

4.5 - Create them

myNaVector <- rep(NA, 10)
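
A vector built from the plain NA is logical. The sketch below shows two other common ways to create missing values: a typed NA vector, and recoding a sentinel value to NA. The names myNumericNa and scores, and the sentinel -1, are assumptions made for illustration.

```r
myNaVector  <- rep(NA, 10)         # logical vector of ten NA
myNumericNa <- rep(NA_real_, 10)   # double vector of ten NA

# Recode a sentinel value (here -1, an assumed convention) as missing.
scores <- c(90, -1, 75)
scores[scores == -1] <- NA
```
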
lang/r/na.txt · Last modified: 2018/05/13 19:25 by gerardnico