Solved: duplicate 'row.names' are not allowed error while using read.csv()
When you read a CSV file using the read.csv() base R function, you may encounter the error duplicate 'row.names' are not allowed while using the row.names parameter. This error occurs when the column specified by the row.names parameter contains duplicated values. Duplicated row names are not allowed in an R data frame.
For example, if you have the following dataset in a CSV format,
    name,height,weight
    x,1.80,65
    y,1.62,67
    z,1.55,62
    z,1.56,63
In this dataset, one of the values in the name column (z) is duplicated. If you import this dataset using the read.csv() function with the name column specified as row.names, you will get the duplicate 'row.names' are not allowed error.
Let’s replicate the error using the above dataset,
    df = read.csv('https://reneshbedre.github.io/assets/posts/other/data.csv', row.names = "name")
    Error in read.table(file = file, header = header, sep = sep, quote = quote, :
      duplicate 'row.names' are not allowed
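Before choosing a fix, it can help to see which values are actually duplicated. The following is a minimal sketch, not part of the original workflow: it assumes the same four-row dataset, supplied inline through the text argument of read.csv() so that it runs without network access. The names csv_text, df_raw, err_msg, and dup_names are illustrative only.

```r
# Inline copy of the example dataset (assumption: same content as data.csv)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

# Capture the error message instead of stopping the script
err_msg <- tryCatch(
  read.csv(text = csv_text, row.names = "name"),
  error = function(e) conditionMessage(e)
)
err_msg

# Import without row.names, then list the values that occur more than once
df_raw <- read.csv(text = csv_text)
dup_names <- unique(as.character(df_raw$name[duplicated(df_raw$name)]))
dup_names

# anyDuplicated() returns the position of the first repeated value (0 if none)
anyDuplicated(df_raw$name)
```

Here dup_names contains only "z", which confirms that the name column cannot be used as row names without modification.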
There are two ways to fix the duplicated row names error, as outlined below.
Solution 1: Import without row.names
You can import the CSV file without specifying row.names, or set row.names = NULL. Either way, R assigns sequential integers (1, 2, 3, ...) as the row names.
    df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv")
    # same as
    df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv", row.names = NULL)

    # view data frame
    df
    # output
      name height weight
    1    x   1.80     65
    2    y   1.62     67
    3    z   1.55     62
    4    z   1.56     63
Now, if you want to set the first column (name) as row names, you can make the values in the name column unique using make.names(),
    # make values in name column unique
    uniq_name = make.names(df$name, unique = TRUE)
    row.names(df) = uniq_name

    # view data frame
    df
    # output
        name height weight
    x      x   1.80     65
    y      y   1.62     67
    z      z   1.55     62
    z.1    z   1.56     63
You have created a data frame with unique row names. If you would like, you can drop the name column (df[, -1]) or keep it in the data frame.
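As a variation on the make.names() approach, base R also provides make.unique(), which only appends a suffix to repeated values and does not otherwise alter the strings. The sketch below is an illustration under assumptions: it uses an inline copy of the example data (csv_text is a hypothetical name), picks "_" as the separator arbitrarily, and drops the name column afterwards.

```r
# Inline copy of the example dataset (assumption: same content as data.csv)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

df <- read.csv(text = csv_text)

# make.unique() expects a character vector; repeated values get a numeric suffix
rownames(df) <- make.unique(as.character(df$name), sep = "_")

# Drop the now redundant name column
df$name <- NULL

rownames(df)  # "x" "y" "z" "z_1"
df
```

Whether make.names() or make.unique() is the better fit depends on the data: make.names() also converts values into syntactically valid R names, which may change them more than intended.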
Solution 2: Create a matrix
In R, a data frame cannot have duplicated row names, but a matrix can.
    # load dataset
    df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv")

    # view data frame
    df
    # output
      name height weight
    1    x   1.80     65
    2    y   1.62     67
    3    z   1.55     62
    4    z   1.56     63
Create a matrix from the data frame using data.matrix(),
    df_mat = data.matrix(df)
    row.names(df_mat) = df$name

    # drop first name column
    df_mat = df_mat[ ,-1]

    # view matrix
    df_mat
    # output
      height weight
    x   1.80     65
    y   1.62     67
    z   1.55     62
    z   1.56     63
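One point to keep in mind with duplicated row names in a matrix is that indexing by name returns only the first matching row. The following sketch illustrates this under assumptions: it builds the matrix from an inline copy of the data with as.matrix() on the numeric columns (a swapped-in alternative to data.matrix(), used here only for illustration), and the names csv_text, m, first_z, and all_z are hypothetical.

```r
# Inline copy of the example dataset (assumption: same content as data.csv)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

df <- read.csv(text = csv_text)

# Build a numeric matrix from the measurement columns and attach the names
m <- as.matrix(df[, c("height", "weight")])
rownames(m) <- as.character(df$name)

# Indexing by name matches only the first row called "z"
first_z <- m["z", ]
first_z

# A logical comparison returns every row called "z"
all_z <- m[rownames(m) == "z", , drop = FALSE]
all_z
```

If all rows that share a name are needed, the logical comparison on rownames() is the safer way to select them.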
This work is licensed under a Creative Commons Attribution 4.0 International License