Solved: duplicate ‘row.names’ are not allowed error while using read.csv()

Renesh Bedre

When you read a CSV file with the base R function read.csv() and set the row.names parameter, you may encounter the error duplicate 'row.names' are not allowed. This error occurs when the column specified by row.names contains duplicated values, because an R data frame requires unique row names.

For example, suppose you have the following dataset in CSV format:

name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63

In this dataset, the value z appears twice in the name column. If you import this dataset using the read.csv() function with the name column specified as row.names, you will get the duplicate 'row.names' are not allowed error.

Let’s replicate the error using the above dataset:

df = read.csv('https://reneshbedre.github.io/assets/posts/other/data.csv', row.names = "name")

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed
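
To confirm which values trigger the error, one option is to read the file without row.names and inspect the column for duplicates. The following is a minimal sketch, not part of the original workflow: it assumes the data shown above is pasted inline as csv_text (a name introduced here for illustration) so that it runs without network access, and dup_names is likewise an illustrative name.

```r
# Inline copy of the example dataset (assumption: same content as data.csv)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

# Read without row.names so the import cannot fail on duplicates
df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Values in the name column that occur more than once
dup_names <- unique(df$name[duplicated(df$name)])
dup_names

# anyDuplicated() returns the index of the first duplicated entry, or 0 if there is none
anyDuplicated(df$name)
```

If dup_names is empty and anyDuplicated() returns 0, the column is safe to pass to row.names.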

There are two ways to fix the duplicated row names error, as outlined below.

Solution 1: Import without row.names parameter

You can import the CSV file without specifying row.names, or set row.names = NULL. Either way, the rows are numbered sequentially (1, 2, 3, ...) instead of being named.

df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv") 
# same as df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv", row.names = NULL) 
# view data frame
df
# output
  name height weight
1    x   1.80     65
2    y   1.62     67
3    z   1.55     62
4    z   1.56     63

Now, if you want to use the first column (name) as row names, you can make the values in the name column unique using the make.names() function with unique = TRUE, which appends a numeric suffix to repeated values.

# make values in name column unique
uniq_name = make.names(df$name, unique = TRUE)
row.names(df) = uniq_name
# view data frame
df
# output
   name height weight
x      x   1.80     65
y      y   1.62     67
z      z   1.55     62
z.1    z   1.56     63

You have now created a data frame with unique row names. You can either keep the name column in the data frame or drop it (df[, -1]).
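
As a variation on the step above, make.unique() can be used instead of make.names() when the original labels should be kept unchanged apart from the suffix. make.names() additionally rewrites values into syntactically valid R names (for example, a label starting with a digit gets an X prefix), while make.unique() only appends suffixes. The sketch below assumes the same inline csv_text as the example dataset; df_values is an illustrative name.

```r
# Inline copy of the example dataset (assumption: same content as data.csv)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# make.unique() appends .1, .2, ... to repeated values and leaves the rest untouched
rownames(df) <- make.unique(df$name)

# Drop the now-redundant name column; the row names are preserved
df_values <- df[, -1]
df_values

# Difference between the two helpers on a label that is not a valid R name
make.names(c("1st", "1st"), unique = TRUE)
make.unique(c("1st", "1st"))
```

For the example dataset both functions give the same row names (x, y, z, z.1), so the choice only matters when the labels contain spaces, leading digits, or other characters that make.names() would alter.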

Solution 2: Create a matrix

In R, a data frame cannot have duplicated row names, but a matrix can. If you need to keep the duplicated names as they are, you can convert the data frame to a matrix.

# load dataset
df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv") 
# view data frame
df
# output
  name height weight
1    x   1.80     65
2    y   1.62     67
3    z   1.55     62
4    z   1.56     63

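Create a matrix from the data frame using the data.matrix() function:

# convert to a numeric matrix (the character name column is converted to integer codes)
df_mat = data.matrix(df)
# matrix row names may contain duplicates
row.names(df_mat) = df$name
# drop the first (name) column, which is no longer needed
df_mat = df_mat[ ,-1]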
# view matrix
df_mat
# output
  height weight
x   1.80     65
y   1.62     67
z   1.55     62
z   1.56     63
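
One thing to keep in mind with duplicated row names in a matrix is that indexing by name returns only the first matching row. The following sketch, which again assumes an inline copy of the example dataset (csv_text, first_z, and all_z are illustrative names), shows a logical comparison on rownames() as one way to retrieve every matching row.

```r
# Inline copy of the example dataset (assumption: same content as data.csv)
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Build the numeric matrix from the measurement columns only
df_mat <- data.matrix(df[, -1])
rownames(df_mat) <- df$name

# Character indexing matches only the first row named "z"
first_z <- df_mat["z", ]
first_z

# A logical index on the row names returns all rows named "z"
all_z <- df_mat[rownames(df_mat) == "z", , drop = FALSE]
all_z
```

Here drop = FALSE keeps the result as a matrix even when only one row matches.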

This work is licensed under a Creative Commons Attribution 4.0 International License
