%in% and %notin% operators in R

Renesh Bedre

%in% operator in R

  • %in% is a built-in infix operator built on the value-matching function match(): x %in% y is equivalent to match(x, y, nomatch = 0) > 0.
  • %in% returns a logical vector (TRUE or FALSE, never NA) indicating whether each element of its left operand has a match in the right operand. The output has the same length as the left operand.
  • For two vectors x and y, the syntax is: x %in% y
  • %in% works with vectors; other inputs, such as factors, are converted before matching (see ?match)
  • %notin% is not a built-in operator and can be created by negating the %in% operator (see below)
  • Help syntax for %in% operator: ?"%in%"
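
The points above can be checked with a short sketch. The vectors x and y here are illustrative values, not taken from the examples below; the sketch shows that %in% agrees with its match()-based definition and that it returns FALSE where == would return NA.

```r
x <- c(1, NA, 3)
y <- c(3, 4)

# %in% is defined in base R in terms of match()
in_result    <- x %in% y
match_result <- match(x, y, nomatch = 0) > 0

in_result                            # FALSE FALSE  TRUE
identical(in_result, match_result)   # TRUE

# == propagates NA, whereas %in% never returns NA
x == 3                               # FALSE    NA  TRUE

# NA counts as a match only when the right operand contains NA
NA %in% c(1, NA)                     # TRUE
```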

%in% to check values in a vector

%in% is useful for checking whether a value is present in a vector; it returns TRUE or FALSE for each value checked.

x <- c(1,5,10,20,20,24,45)

# check whether a number is present in vector x
20 %in% x
[1] TRUE

# check each value of one vector against another vector
y <- c(5,45)
y %in% x
[1] TRUE TRUE

# check sequence of numbers in other sequence (find any overlapping numbers)
1:5 %in% 4:7
[1] FALSE FALSE FALSE  TRUE  TRUE

# find common numbers (intersection)
x[x %in% y]  # similar to intersect(x,y)
[1]  5 45

# check which characters are present in another sequence of characters (find any overlapping characters)
LETTERS[1:5] %in% LETTERS[4:7]
[1] FALSE FALSE FALSE  TRUE  TRUE
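
One caveat on the intersect() comparison above: subsetting with %in% keeps the order and any duplicates of the left vector, while intersect() returns unique values only. A minimal sketch, reusing x from above with an illustrative y (the names kept and common are made up for this example):

```r
x <- c(1, 5, 10, 20, 20, 24, 45)
y <- c(20, 5)

kept   <- x[x %in% y]      # keeps the order and duplicates of x
common <- intersect(x, y)  # unique values only

kept     # 5 20 20
common   # 5 20
```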

If you have a large vector (say, a vector with 1000 values), you can wrap %in% in the any or all functions to reduce the result to a single TRUE or FALSE

x <- 1:1000
y <- 900:2000

# check if there are any common values between the x and y vectors
any(x %in% y)
[1] TRUE

# check if all values of x are present in y
all(x %in% y)
[1] FALSE

%in% to check values in a Data Frame

Create a Data Frame

df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))
df
# output
  col1 col2 col3
1    A    1  0.1
2    B    2  0.2
3    C    3  0.3

Check whether a value is present in a Data Frame column

'B' %in% df$col1
[1] TRUE

# check which col1 values are B
df$col1  %in% 'B'
[1] FALSE  TRUE FALSE
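
Because df$col1 %in% ... returns one logical value per row, it can also be used as a row filter. The following is a minimal sketch using the same Data Frame; the names wanted and rows are hypothetical and chosen only for illustration.

```r
df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))

# hypothetical set of col1 values to keep
wanted <- c("A", "C")

# keep only the rows whose col1 value is in the wanted set
rows <- df[df$col1 %in% wanted, ]
rows
```

This keeps rows 1 and 3 and is often more convenient than chaining several == comparisons with |.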

Check a vector of values against a Data Frame and update the matching values:

# check each Data Frame column against a vector of values
lapply(df, `%in%`, c(1, 4, 0.1))
# output
$col1
[1] FALSE FALSE FALSE

$col2
[1]  TRUE FALSE FALSE

$col3
[1]  TRUE FALSE FALSE

# replace the matching values with 0
df[sapply(df, `%in%`, c(1, 4, 0.1))] <- 0
df
# output
  col1 col2 col3
1    A    0  0.0
2    B    2  0.2
3    C    3  0.3

Comparison of %in% and == operators

  • The == operator compares two vectors element-wise (the first value of one vector is compared with the first value of the other, and so on), whereas %in% compares each value of the left vector against all values of the right vector.
  • With ==, operands of different lengths are recycled: the shorter vector is repeated to match the longer one, with a warning if the longer length is not a multiple of the shorter. %in% has no such behavior; its result always has the length of the left operand, whatever the length of the right operand.

x <- c(1, 2, 3)
y <- c(3, 2, 1)

x %in% y
[1] TRUE TRUE TRUE

x == y
[1] FALSE  TRUE FALSE
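
The difference is easier to see when the operands have different lengths. In this sketch (illustrative values; eq_result and in_result are names made up for the example), == silently recycles the shorter vector, while %in% looks up each left-hand value in the whole right-hand vector.

```r
x <- c(2, 1, 3, 1)
y <- c(1, 2)

eq_result <- x == y     # y is recycled to c(1, 2, 1, 2)
in_result <- x %in% y   # each x value is looked up in all of y

eq_result   # FALSE FALSE FALSE FALSE
in_result   # TRUE  TRUE FALSE  TRUE
```

For membership tests, %in% is usually what is intended; == with a shorter right operand can give misleading results without any warning.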

Create %notin% operator

The %notin% operator can be created by applying the Negate function to `%in%`:

`%notin%` <- Negate(`%in%`)
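
A quick check on plain vectors, as a sketch with illustrative values (not_in is a name made up for the example), shows that %notin% gives the same result as negating %in% directly:

```r
`%notin%` <- Negate(`%in%`)

x <- c(1, 5, 10, 20)
y <- c(5, 20)

not_in <- x %notin% y
not_in       # TRUE FALSE  TRUE FALSE
x[not_in]    # 1 10

# same result without defining a new operator
identical(not_in, !(x %in% y))   # TRUE
```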

Update values of a Data Frame to NA where the values do not match:

# create data frame
df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))

# set values of the Data Frame to NA where they do not match c(2, 3)
df[sapply(df, `%notin%`, c(2, 3))] <- NA
df
# output
  col1 col2 col3
1 <NA>   NA   NA
2 <NA>    2   NA
3 <NA>    3   NA

This work is licensed under a Creative Commons Attribution 4.0 International License
