R - Subset Operators (Extract or Replace Parts of an Object)

About

In the Data Manipulation category, you have the subset operators:

  • [
  • [[
  • $

These operators act on vectors, matrices, arrays, lists and data frames to extract or replace parts.

See also: the subset function

Usage

# Extraction
x[columnsSelector]
x[rowsSelector, columnsSelector, drop = ]
x[[rowsSelector, columnsSelector]] # can only select a single element; the element is returned without its container, as with drop = TRUE
x$name

# Replace
x[rowsSelector, columnsSelector] <- value

where:

  • x is an object (data frame, …)
  • rowsSelector: (optional) rows selector by index or name to extract or replace. Default all rows.
  • columnsSelector: columns selector by index or name to extract or replace. For example:
      2:3                # columns 2 to 3
      c("colA", "colC")  # columns colA and colC
  • drop: (optional) logical. If TRUE, the result is coerced to the lowest possible dimension. With matrix-style indexing x[i, j], the default is to drop if only one column is left (the column is returned as a vector), but not to drop if only one row is left (the row stays in a data frame).
  • $ is used to extract elements of a list or data frame by name.
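  • [ returns an object of the same class as the original in most cases and can select more than one element. The one exception is matrix-style indexing that leaves a single column (for example df[, 2]), which returns a vector unless drop = FALSE.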
  • [[ is used to extract elements of a list or a data frame; it can only be used to extract a single element and the class of the returned object will not necessarily be a list or data frame
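
The difference between the three operators is easiest to see by checking what each one returns. The following is a minimal sketch that rebuilds the demo data frame of this page; the variable names (by_single, by_double, ...) are only illustrative.

```r
# Same demo data frame as the one used on this page
df <- data.frame(colA = 1:5, colB = 2:6, colC = 3:7, row.names = letters[1:5])

by_single <- df["colB"]    # `[` with one index: still a data frame
by_double <- df[["colB"]]  # `[[`: the column itself, an integer vector
by_dollar <- df$colB       # `$`: same element as df[["colB"]]
by_matrix <- df[, "colB"]  # matrix-style `[`: dropped to a vector (the exception)

class(by_single)                 # "data.frame"
class(by_double)                 # "integer"
identical(by_double, by_dollar)  # TRUE
identical(by_double, by_matrix)  # TRUE
```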

Demo Data

We will subset the following data frame

> df = data.frame(colA=1:5,colB=2:6,colC=3:7,row.names=letters[1:5])
> df
  colA colB colC
a    1    2    3
b    2    3    4
c    3    4    5
d    4    5    6
e    5    6    7

Examples for a list are available on the R - List page.

Subset Type

Column

One Column

  • Extract the column by name:
df$colA
[1] 1 2 3 4 5

  • Retrieve the second column by index number and return a vector:
df[,2]  # returns an integer vector because drop defaults to TRUE when a single column is left (not for a single row)
df[,2, drop = TRUE] # same
[1] 2 3 4 5 6

  • Retrieve the column by index number and return a data frame. The second:
df[2]
df[,2,drop = FALSE] # same
  colB
a    2
b    3
c    4
d    5
e    6

  • Retrieve the column by name:
df[,"colB"]
[1] 2 3 4 5 6

  • Remove the second column
df[-2]
  colA colC
a    1    3
b    2    4
c    3    5
d    4    6
e    5    7
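
Negative indexes only work with numeric positions, so df[-"colB"] raises an error. One possible way to remove a column by name is a logical mask built from names(df). This is a sketch on the demo data; keep and no_colB are illustrative names.

```r
df <- data.frame(colA = 1:5, colB = 2:6, colC = 3:7, row.names = letters[1:5])

# Negative indices need numbers, so build a logical mask from the names instead
keep    <- !(names(df) %in% "colB")
no_colB <- df[keep]

names(no_colB)  # "colA" "colC"
ncol(df)        # 3: df itself is unchanged until the result is assigned back
```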

Multiple Columns

  • Retrieve several columns by name or by index
# By name
df[,c("colA","colB")]
# or
df[c("colA","colB")]
# By index
df[,c(1,2)]
# or
df[c(1,2)]
  colA colB
a    1    2
b    2    3
c    3    4
d    4    5
e    5    6

  • Retrieve columns by range
df[2:3]
# or
df[,2:3]
  colB colC
a    2    3
b    3    4
c    4    5
d    5    6
e    6    7

  • Removing Multiple Columns
df[-c(1,2)]
  colC
a    3
b    4
c    5
d    6
e    7

Row

One Row

  • Retrieve the fourth row by index. A single row is not dropped by default, so the result is still a data frame:
df[4,]
  colA colB colC
d    4    5    6

  • Retrieve the fourth row by index, explicitly keeping the data frame (same result as df[4,]):
df[4,,drop=FALSE]
  colA colB colC
d    4    5    6

  • Retrieve the row by name
df["d",]
  colA colB colC
d    4    5    6
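
To actually drop the dimension of a single row, drop = TRUE has to be requested explicitly. A short sketch on the demo data (row_df and row_list are illustrative names) shows the difference:

```r
df <- data.frame(colA = 1:5, colB = 2:6, colC = 3:7, row.names = letters[1:5])

row_df   <- df[4, ]               # default for a single row: still a data frame
row_list <- df[4, , drop = TRUE]  # explicit drop: a plain list, one element per column

class(row_df)     # "data.frame"
class(row_list)   # "list"
unlist(row_list)  # named integer vector: colA = 4, colB = 5, colC = 6
```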

Multiple Rows Indexing

  • Retrieve two rows by naming
df[c("b","e"),]
  colA colB colC
b    2    3    4
e    5    6    7

  • Retrieve the first and third rows by logical vector
df[c(TRUE,FALSE,TRUE,FALSE,FALSE),]
  colA colB colC
a    1    2    3
c    3    4    5

Multiple Rows Filtering

  • Retrieve the rows by logical vector where colB > 3
df[df$colB>3,]
# or
df[df[,"colB"]>3,]
# or
df[df[,2]>3,]
  colA colB colC
c    3    4    5
d    4    5    6
e    5    6    7

  • Retrieve the rows by logical vector where colB > 3 and colC <= 6
df[df$colB > 3 & df$colC <= 6,]
  colA colB colC
c    3    4    5
d    4    5    6
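
When the filtered column contains missing values, the comparison returns NA and a logical index with NA produces an all-NA row. Wrapping the condition in which() is one common way to keep only the TRUE positions. The sketch below uses a variant of the demo data with one NA added for illustration.

```r
# Variant of the demo data with a missing value in colB (added for illustration)
df <- data.frame(colA = 1:5, colB = c(2L, NA, 4L, 5L, 6L), colC = 3:7,
                 row.names = letters[1:5])

with_na    <- df[df$colB > 3, ]         # the NA comparison yields an all-NA row
without_na <- df[which(df$colB > 3), ]  # which() keeps only the TRUE positions

nrow(with_na)         # 4
nrow(without_na)      # 3
rownames(without_na)  # "c" "d" "e"
```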

Vertical and Horizontal

  • Extracting the intersection of rows and columns
df[2:4,2:3]
  colB colC
b    3    4
c    4    5
d    5    6
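
The same row and column selectors also work for a single cell and on the left-hand side of an assignment. A minimal sketch on the demo data follows; df2 is only a working copy so that df stays intact.

```r
df <- data.frame(colA = 1:5, colB = 2:6, colC = 3:7, row.names = letters[1:5])

# `[[` with a row and a column selector extracts exactly one cell
cell_by_index <- df[[2, 3]]         # row 2, column 3
cell_by_name  <- df[["b", "colC"]]  # same cell, addressed by name

# `[<-` replaces the whole intersection; work on a copy to keep df intact
df2 <- df
df2[2:4, 2:3] <- 0L

cell_by_index  # 4
df2$colB       # 2 0 0 0 6
```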

  • Update the total time column where the success flag is 3 and the error text contains 46066 (res is another data frame, not the demo data)
res[res$SUCCESS_FLG == 3 & grepl("46066", res$ERROR_TEXT), "TOTAL_TIME_SEC"] <- 200
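
The res data frame is not defined on this page. To try the update, a hypothetical stand-in with the three columns used by the statement is enough; the values below are made up for illustration.

```r
# Hypothetical stand-in for `res`; column names come from the statement, values are invented
res <- data.frame(
  SUCCESS_FLG    = c(1, 3, 3),
  ERROR_TEXT     = c("", "error 46066: sample message", "other error"),
  TOTAL_TIME_SEC = c(10, 20, 30)
)

# Build the row filter first, then replace only the selected cells
hit <- res$SUCCESS_FLG == 3 & grepl("46066", res$ERROR_TEXT)
res[hit, "TOTAL_TIME_SEC"] <- 200

res$TOTAL_TIME_SEC  # 10 200 30
```

Storing the logical filter in a variable first makes it easy to inspect which rows will be changed before the assignment runs.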

Documentation / Reference

?":"
?"["
?"$"
?"[["
?"[.data.frame"




