%in% and %notin% operators in R

Renesh Bedre    4 minute read

Introduction

  • %in% is a built-in infix operator built on the value-matching function match().
  • %in% returns a logical vector indicating whether each element of its left operand has a match in the right operand. Each value is TRUE or FALSE, never NA, and the output has the same length as the left operand.
  • For two vectors x and y, the syntax is: x %in% y
  • %in% works on vectors; to use it on a Data Frame, apply it column by column (see below)
  • %notin% is not a built-in operator and can be created by negating the %in% operator (see below)
  • Help syntax for %in% operator: ?"%in%"
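
The relationship to match can be checked directly. The sketch below relies on the definition given in R's documentation, where x %in% table is equivalent to match(x, table, nomatch = 0) > 0. The vector names values and lookup are only illustrative.

```r
# Illustrative vectors (names are hypothetical, not from the text above)
values <- c(1, 5, NA)
lookup <- c(5, 10)

# %in% yields TRUE or FALSE for each element of the left operand
in_result <- values %in% lookup
print(in_result)  # FALSE TRUE FALSE

# match() returns the position of the first match, or NA when there is none
positions <- match(values, lookup)
print(positions)  # NA 1 NA

# Mapping "no match" to 0 and testing > 0 reproduces %in%
match_result <- match(values, lookup, nomatch = 0L) > 0L
print(identical(in_result, match_result))  # TRUE

# By contrast, == propagates NA
print(values == 5)  # FALSE TRUE NA
```

Because the result never contains NA, it is safe to use directly for subsetting or inside if conditions on single values.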

%in% to check values in a vector

%in% is helpful for checking whether a value is present in a vector; it returns TRUE or FALSE

x <- c(1,5,10,20,20,24,45)

# check whether a number is present in vector x
20 %in% x
[1] TRUE

# check each element of vector y against vector x
y <- c(5,45)
y %in% x
[1] TRUE TRUE

# check sequence of numbers in other sequence (find any overlapping numbers)
1:5 %in% 4:7
[1] FALSE FALSE FALSE  TRUE  TRUE

# find common numbers (intersection)
x[x %in% y]  # similar to intersect(x,y)
[1]  5 45

# check which characters are present in another sequence of characters (find overlapping characters)
LETTERS[1:5] %in% LETTERS[4:7]
[1] FALSE FALSE FALSE  TRUE  TRUE

If you have a large vector (say, one with 1000 values), you can combine the any or all functions with the %in% operator to summarize the result

x <- 1:1000
y <- 900:2000

# check if the x and y vectors have any values in common
any(x %in% y)
[1] TRUE

# check if all values of x are also present in y
all(x %in% y)
[1] FALSE
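
Beyond any and all, the logical vector returned by %in% can be summarized in other ways. As a small sketch using the same x and y, sum counts the matches and which gives their positions. The names n_common and common_pos are illustrative.

```r
x <- 1:1000
y <- 900:2000

# count how many values of x also occur in y (each TRUE counts as 1)
n_common <- sum(x %in% y)
print(n_common)  # 101, the values 900 to 1000

# positions in x of the matching values
common_pos <- which(x %in% y)
print(range(common_pos))  # 900 1000
```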

%in% to check values in a Data Frame

Create a Data Frame

df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))
df
# output
  col1 col2 col3
1    A    1  0.1
2    B    2  0.2
3    C    3  0.3

Check whether a value is present in a Data Frame column

'B' %in% df$col1
[1] TRUE

# check which col1 values are B
df$col1  %in% 'B'
[1] FALSE  TRUE FALSE

Check a vector of values against a Data Frame and update the matching values:

# check vector values in a Data Frame
lapply(df, `%in%`, c(1, 4, 0.1))
# output
$col1
[1] FALSE FALSE FALSE

$col2
[1]  TRUE FALSE FALSE

$col3
[1]  TRUE FALSE FALSE

# find and replace with 0
df[sapply(df, `%in%`, c(1, 4, 0.1))] <- 0
df
# output
  col1 col2 col3
1    A    0  0.0
2    B    2  0.2
3    C    3  0.3
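
To see why the replacement above works, it can help to look at the intermediate result. In this sketch, sapply applies %in% to each column and simplifies the result to a logical matrix with one column per Data Frame column; indexing the Data Frame with that matrix selects exactly the matching cells. The names mask and n_hits are only illustrative.

```r
df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))

# logical matrix: TRUE where a cell matches one of the values
mask <- sapply(df, `%in%`, c(1, 4, 0.1))
print(mask)

# number of cells that will be replaced
n_hits <- sum(mask)
print(n_hits)  # 2

# replace only the matching cells
df[mask] <- 0
print(df)
```

Printing the mask before assigning is a cheap way to confirm which cells will change.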

%in% to filter (subset) Data Frames based on multiple values

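Filter (subset) the Data Frame to keep rows where col1 matches any of several values:

# df here is the Data Frame as originally created, before values were replaced with 0
library(dplyr)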
df  %>% filter(col1 %in% c('A', 'B'))  # same as df[df$col1 %in% c('A', 'B'),]
# output
  col1 col2 col3
1    A    1  0.1
2    B    2  0.2
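
The same filter can be written without dplyr, and negating the condition keeps the rows that do not match. A minimal base R sketch, assuming the Data Frame as originally created (the names kept and excluded are illustrative):

```r
df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))

# keep rows where col1 is A or B
kept <- df[df$col1 %in% c("A", "B"), ]
print(kept)

# keep rows where col1 is neither A nor B
excluded <- df[!(df$col1 %in% c("A", "B")), ]
print(excluded)
```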

Comparison of %in% and == operators

  • The == operator compares two vectors element-wise (the first value of one vector with the first value of the other, and so on), whereas %in% compares each value of the left vector against all values of the right vector.
  • With ==, the operands should have the same length (or one should have length 1); otherwise the shorter one is recycled, with a warning if the longer length is not a multiple of the shorter. %in% places no such requirement on the lengths of its operands.

x <- c(1, 2, 3)
y <- c(3, 2, 1)

x %in% y
[1] TRUE TRUE TRUE

x == y
[1] FALSE  TRUE FALSE
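
The difference becomes more visible when the operands differ in length or contain NA. In this sketch (with illustrative vector names a and b), == recycles the shorter operand and propagates NA, while %in% checks each left-hand value against the whole right-hand vector.

```r
a <- c(1, 2, 3, 4)
b <- c(2, 1)

# == recycles b to c(2, 1, 2, 1) and compares position by position
eq_result <- a == b
print(eq_result)  # FALSE FALSE FALSE FALSE

# %in% asks, for each element of a, whether it occurs anywhere in b
in_result <- a %in% b
print(in_result)  # TRUE TRUE FALSE FALSE

# NA handling: == returns NA, %in% returns FALSE
eq_na <- c(1, NA) == 1
in_na <- c(1, NA) %in% 1
print(eq_na)  # TRUE NA
print(in_na)  # TRUE FALSE
```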

Create %notin% operator

The %notin% operator is not built in, but it can be created by applying the Negate function to %in%.

You can also get the same result without defining %notin% by putting ! in front of the %in% expression, for example !(x %in% y).

`%notin%` <- Negate(`%in%`)

Check that a value is absent from a vector using %notin%

x <- c(1,5,10,20,20,24,45)

# check whether a number is absent from vector x
50 %notin% x  # same as !(50 %in% x)
[1] TRUE
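
As a quick check that the two forms agree, the sketch below applies %notin% to several values at once and compares it with the ! form. The vector candidates and the name absent are illustrative.

```r
`%notin%` <- Negate(`%in%`)

x <- c(1, 5, 10, 20, 20, 24, 45)
candidates <- c(5, 50, 24, 7)

# TRUE for values that do not occur in x
absent <- candidates %notin% x
print(absent)  # FALSE TRUE FALSE TRUE

# the ! form gives the same result
print(identical(absent, !(candidates %in% x)))  # TRUE

# keep the elements of x that are not in a given set
remaining <- x[x %notin% c(20, 45)]
print(remaining)  # 1 5 10 24
```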

Update values of a Data Frame to NA where they do not match:

# create data frame
df <- data.frame(col1 = c("A", "B", "C"),
  col2 = c(1, 2, 3),
  col3 = c(0.1, 0.2, 0.3))

# update values of the Data Frame to NA where they do not match c(2, 3)
df[sapply(df, `%notin%`, c(2, 3))] <- NA  # same as df[!sapply(df, `%in%`, c(2, 3))] <- NA
df
# output
  col1 col2 col3
1 <NA>   NA   NA
2 <NA>    2   NA
3 <NA>    3   NA

This work is licensed under a Creative Commons Attribution 4.0 International License