# `%in%` and `%notin%` operators in R

### Introduction

• `%in%` is a built-in infix operator, closely related to the value-matching function `match`.
• `%in%` returns a logical vector (TRUE or FALSE, but never NA) indicating whether each element of its left operand has a match in the right operand. The output has the same length as the left operand.
• If there are two vectors `x` and `y`, the syntax of `%in%` is `x %in% y`.
• `%in%` works on vectors, so both operands should be vectors (or objects that R can treat as vectors, such as Data Frame columns).
• `%notin%` is not a built-in operator, but it can be created by negating the `%in%` operator (see below).
• To open the help page for the `%in%` operator, run `?"%in%"`.
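
The relationship with `match` and the never-NA behavior described above can be sketched as follows. The vector names `vals` and `lookup` are made up for this illustration.

```r
vals   <- c(1, NA, 3)
lookup <- c(3, 4)

# match() returns the position of the first match, or NA when there is none
match(vals, lookup)
# [1] NA NA  1

# %in% only reports whether a match exists, so the result is never NA
vals %in% lookup
# [1] FALSE FALSE  TRUE

# in base R, x %in% table behaves like match(x, table, nomatch = 0L) > 0L
identical(vals %in% lookup, match(vals, lookup, nomatch = 0L) > 0L)
# [1] TRUE
```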

### `%in%` to check the value in a vector

`%in%` is helpful for checking whether a value is present in a vector; it returns TRUE or FALSE for each value checked.

```r
x <- c(1, 5, 10, 20, 20, 24, 45)

# check whether a single number is in x
20 %in% x
# [1] TRUE

# check whether each element of y is in x
y <- c(5, 45)
y %in% x
# [1] TRUE TRUE

# check one sequence against another (find overlapping numbers)
1:5 %in% 4:7
# [1] FALSE FALSE FALSE  TRUE  TRUE

# find common numbers (intersection)
x[x %in% y]  # similar to intersect(x, y)
# [1]  5 45

# check which characters are present in another character vector
LETTERS[1:5] %in% LETTERS[4:7]
# [1] FALSE FALSE FALSE  TRUE  TRUE
```
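
Subsetting with `%in%` is similar to `intersect`, but not identical when the left vector contains duplicates. A small sketch of the difference, using illustrative vectors `a` and `b`:

```r
a <- c(1, 5, 10, 20, 20, 24, 45)
b <- c(20, 45)

# subsetting with %in% keeps every matching element, including duplicates
a[a %in% b]
# [1] 20 20 45

# intersect() returns each common value only once
intersect(a, b)
# [1] 20 45

# which() gives the positions of the matching elements in a
which(a %in% b)
# [1] 4 5 7
```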

If you have a large vector (say, a vector with 1000 values), you can wrap the `%in%` expression in the `any` or `all` function to reduce the result to a single TRUE or FALSE.

```r
x <- 1:1000
y <- 900:2000

# check whether x and y have any values in common
any(x %in% y)
# [1] TRUE

# check whether all values of x are also present in y
all(x %in% y)
# [1] FALSE
```
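
Beyond `any` and `all`, the logical vector can also be summarized numerically. The following is a brief sketch on the same `x` and `y`; it relies only on TRUE being counted as 1 in arithmetic.

```r
x <- 1:1000
y <- 900:2000

# sum() gives the number of elements of x that are found in y
sum(x %in% y)
# [1] 101

# mean() gives the proportion of elements of x that are found in y
mean(x %in% y)
# [1] 0.101

# the direction matters: this asks whether every element of y is in x
all(y %in% x)
# [1] FALSE
```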

### `%in%` to check the value in a Data Frame

Create a Data Frame

```r
df <- data.frame(col1 = c("A", "B", "C"),
                 col2 = c(1, 2, 3),
                 col3 = c(0.1, 0.2, 0.3))
df
# output
#   col1 col2 col3
# 1    A    1  0.1
# 2    B    2  0.2
# 3    C    3  0.3
```

Check whether a value is present in a Data Frame column

```r
# check whether 'B' appears anywhere in col1
'B' %in% df$col1
# [1] TRUE

# check which col1 values are 'B'
df$col1 %in% 'B'
# [1] FALSE  TRUE FALSE
```

Check a vector of values against each Data Frame column and update the matching values:

```r
# check vector values against each column of the Data Frame
lapply(df, `%in%`, c(1, 4, 0.1))
# output
# $col1
# [1] FALSE FALSE FALSE
#
# $col2
# [1]  TRUE FALSE FALSE
#
# $col3
# [1]  TRUE FALSE FALSE

# find the matching values and replace them with 0
df[sapply(df, `%in%`, c(1, 4, 0.1))] <- 0
df
# output
#   col1 col2 col3
# 1    A    0  0.0
# 2    B    2  0.2
# 3    C    3  0.3
```
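
Because `%in%` matches values exactly, numeric columns that hold computed floating-point values may not match as expected. The sketch below shows the issue and one possible workaround. The helper name `near_in` and the tolerance `1e-8` are assumptions made for this illustration and are not part of base R.

```r
# 0.1 + 0.2 is not stored as exactly 0.3, so exact matching fails
(0.1 + 0.2) %in% 0.3
# [1] FALSE

# hypothetical helper: TRUE where an element of x lies within tol of any value in table
near_in <- function(x, table, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - table) < tol), logical(1))
}

near_in(c(0.1 + 0.2, 0.5), c(0.3, 0.7))
# [1]  TRUE FALSE
```

Values typed in directly, such as the `0.1` in the Data Frame above, match themselves without any tolerance.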

### `%in%` to filter (subset) Data Frames based on multiple values

Filter (subset) the Data Frame to the rows where col1 matches any of several values:

```r
library(dplyr)

# df is the original Data Frame created above (before values were replaced with 0)
df %>% filter(col1 %in% c('A', 'B'))  # same as df[df$col1 %in% c('A', 'B'), ]
# output
#   col1 col2 col3
# 1    A    1  0.1
# 2    B    2  0.2
```
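
The same filter can be written without dplyr. This is a minimal base R sketch that recreates the example Data Frame so it runs on its own; the variable name `kept` is illustrative.

```r
df <- data.frame(col1 = c("A", "B", "C"),
                 col2 = c(1, 2, 3),
                 col3 = c(0.1, 0.2, 0.3))

# logical row index built with %in%
kept <- df[df$col1 %in% c("A", "B"), ]
kept
#   col1 col2 col3
# 1    A    1  0.1
# 2    B    2  0.2

# subset() evaluates the condition inside the Data Frame
subset(df, col1 %in% c("A", "B"))
#   col1 col2 col3
# 1    A    1  0.1
# 2    B    2  0.2

# negate the condition to drop the matching rows instead
df[!(df$col1 %in% c("A", "B")), ]
#   col1 col2 col3
# 3    C    3  0.3
```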

### Comparison of `%in%` and `==` operators

• The `==` operator compares two vectors element-wise (the first value of one vector with the first value of the other, and so on), whereas `%in%` checks each value of the left vector against all values of the right vector.
• With the `==` operator, the left and right operands should normally have the same length; if they differ, R recycles the shorter one (and warns if the longer length is not a multiple of the shorter). The `%in%` operator places no such requirement on the lengths of its operands.

```r
x <- c(1, 2, 3)
y <- c(3, 2, 1)

x %in% y
# [1] TRUE TRUE TRUE

x == y
# [1] FALSE  TRUE FALSE
```
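
The difference becomes more visible when the operands have different lengths or contain missing values. In the sketch below, with illustrative vectors `p` and `q`, `==` recycles the shorter operand and propagates NA, while `%in%` does neither.

```r
p <- c(1, 2, 1, 2)
q <- c(2, 1)

# == recycles q to c(2, 1, 2, 1) and compares position by position
p == q
# [1] FALSE FALSE FALSE FALSE

# %in% asks, for each element of p, whether it occurs anywhere in q
p %in% q
# [1] TRUE TRUE TRUE TRUE

# with missing values, == propagates NA while %in% returns FALSE
c(1, NA) == 1
# [1] TRUE   NA

c(1, NA) %in% 1
# [1]  TRUE FALSE
```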

### Create `%notin%` operator

The `%notin%` operator is not built in, but it can be created by applying the `Negate` function to `%in%`:

```r
`%notin%` <- Negate(`%in%`)
```

You can also get the same result without defining a new operator by putting `!` in front of the `%in%` expression, for example `!(x %in% y)`.

Check whether a value is absent from a vector using `%notin%`

```r
x <- c(1, 5, 10, 20, 20, 24, 45)

# check whether 50 is absent from x
50 %notin% x  # same as !(50 %in% x)
# [1] TRUE
```
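
As an alternative to `Negate`, the operator can be written out as an explicit function, which some readers may find easier to follow. This is only a sketch; the argument names `x` and `table` are chosen here to mirror those of `match`.

```r
# alternative definition, written as an explicit two-argument function
`%notin%` <- function(x, table) !(x %in% table)

x <- c(1, 5, 10, 20, 20, 24, 45)

# keep only the elements of x that are not 20 or 45
x[x %notin% c(20, 45)]
# [1]  1  5 10 24

# both forms give the same logical vector
identical(x %notin% c(20, 45), !(x %in% c(20, 45)))
# [1] TRUE
```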

Update Data Frame values to NA where they do not match:

```r
# create data frame
df <- data.frame(col1 = c("A", "B", "C"),
                 col2 = c(1, 2, 3),
                 col3 = c(0.1, 0.2, 0.3))

# set every value that is not 2 or 3 to NA
df[sapply(df, `%notin%`, c(2, 3))] <- NA  # same as df[!sapply(df, `%in%`, c(2, 3))] <- NA
df
# output
#   col1 col2 col3
# 1 <NA>   NA   NA
# 2 <NA>    2   NA
# 3 <NA>    3   NA
```
