When you read a CSV file using the read.csv() base R function, you may encounter the error duplicate 'row.names' are not allowed while using the row.names parameter. An R data frame does not allow duplicated row names, so this error occurs when there are duplicated values in the column specified by row.names.
For example, if you have the following dataset in a CSV format,
    name,height,weight
    x,1.80,65
    y,1.62,67
    z,1.55,62
    z,1.56,63
In this dataset, one of the values in the name column is duplicated (z appears twice). If you import this dataset using the read.csv() function with the name column specified as row.names, you will get the duplicate 'row.names' are not allowed error.
Let’s replicate the error using the above dataset:

    df = read.csv('https://reneshbedre.github.io/assets/posts/other/data.csv', row.names = "name")
    Error in read.table(file = file, header = header, sep = sep, quote = quote, :
      duplicate 'row.names' are not allowed
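The error can also be reproduced without downloading the file. The following is a minimal sketch that assumes the same four rows are passed inline through the text argument of read.csv(); the variable names csv_text and msg are only illustrative. It captures the error message with tryCatch() so the script keeps running.

```r
# Same four rows as the example dataset, supplied inline
# (assumption: used here so that no download is needed)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

# Capture the error message instead of stopping the script
msg <- tryCatch(
  {
    read.csv(text = csv_text, row.names = "name")
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Show what happened during the import
print(msg)
```

Wrapping the import in tryCatch() like this may be handy when you load many files in a loop and want to record which ones contain duplicated identifiers.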
To fix the duplicate 'row.names' are not allowed error, you can use either of the following two solutions.
Solution 1: Import without row.names
In this solution, you fix the error by importing the CSV file without specifying row.names, or by setting row.names = NULL. Either way, R assigns sequential numbers (1, 2, 3, ...) as the row names.
    df <- read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv")
    # same as
    df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv", row.names = NULL)
    # view data frame
    df
      name height weight
    1    x   1.80     65
    2    y   1.62     67
    3    z   1.55     62
    4    z   1.56     63
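Once the data is imported with numeric row names, it can help to check which values are actually duplicated before deciding how to handle them. The sketch below assumes a data frame built in memory that mirrors the example CSV; dup_values is an illustrative variable name.

```r
# Example data built in memory (assumption: mirrors the CSV used in this post)
df <- data.frame(
  name = c("x", "y", "z", "z"),
  height = c(1.80, 1.62, 1.55, 1.56),
  weight = c(65, 67, 62, 63)
)

# Index of the first duplicated value, or 0 if every value is unique
anyDuplicated(df$name)

# Values that occur more than once
dup_values <- unique(df$name[duplicated(df$name)])
dup_values

# All rows that share a duplicated name
df[df$name %in% dup_values, ]
```

If anyDuplicated() returns 0, the column can be used as row names directly.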
Now, if you want to set the first column (name) as row names, you can make the values in the name column unique using the make.names() function with unique = TRUE:
    # make values in name column unique
    uniq_name <- make.names(df$name, unique = TRUE)
    row.names(df) <- uniq_name
    # view data frame
    df
        name height weight
    x      x   1.80     65
    y      y   1.62     67
    z      z   1.55     62
    z.1    z   1.56     63
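Note that make.names() also rewrites values so that they are syntactically valid R names (for example, spaces become dots). If you only want duplicates to be disambiguated and the original text left alone, base R's make.unique() is an alternative. The sketch below compares the two on hypothetical identifiers that are not part of the example dataset.

```r
# Hypothetical identifiers, chosen to include a space and duplicates
ids <- c("z", "z", "sample 1", "sample 1")

# make.names() sanitizes each value first, then makes the result unique
names_sanitized <- make.names(ids, unique = TRUE)
names_sanitized

# make.unique() only appends a suffix to repeated values
names_unique <- make.unique(ids)
names_unique
```

Either result can be assigned with row.names(df) <- ...; which one fits better depends on whether you need the row names to be valid R names.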
You have now created a data frame with unique row names. If you like, you can drop (df[,-1]) or keep the name column in the data frame.
Solution 2: Create a matrix
In this solution, you fix the error by creating a matrix from the data frame.
In R, a data frame does not allow duplicated row names, but a matrix does.
    # load dataset
    df <- read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv")
    # view data frame
    df
      name height weight
    1    x   1.80     65
    2    y   1.62     67
    3    z   1.55     62
    4    z   1.56     63
Create a matrix from the data frame using the data.matrix() function:
    # convert data frame to matrix
    df_mat <- data.matrix(df)
    # assign row names to matrix
    row.names(df_mat) <- df$name
    # drop first (name) column
    df_mat = df_mat[ ,-1]
    # view matrix
    df_mat
      height weight
    x   1.80     65
    y   1.62     67
    z   1.55     62
    z   1.56     63
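One thing to keep in mind with duplicated row names in a matrix: indexing by a character name returns only the first matching row. The sketch below assumes an in-memory copy of the example data and builds the matrix with as.matrix() on the numeric columns only, which is an alternative to data.matrix() that avoids converting the name column to integer codes before it is dropped. The variable names first_z and all_z are illustrative.

```r
# Example data built in memory (assumption: mirrors the CSV used in this post)
df <- data.frame(
  name = c("x", "y", "z", "z"),
  height = c(1.80, 1.62, 1.55, 1.56),
  weight = c(65, 67, 62, 63)
)

# Build the matrix from the numeric columns only, then attach row names
df_mat <- as.matrix(df[, c("height", "weight")])
rownames(df_mat) <- df$name

# Character indexing returns only the first row named "z"
first_z <- df_mat["z", ]
first_z

# Logical indexing returns every row named "z"
all_z <- df_mat[rownames(df_mat) == "z", ]
all_z
```

If you need every row for a duplicated name, prefer the logical comparison over character indexing.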
This work is licensed under a Creative Commons Attribution 4.0 International License