Solved: duplicate ‘row.names’ are not allowed error while using read.csv()
When you read a CSV file using the read.csv() base R function with the row.names parameter, you may encounter the error duplicate 'row.names' are not allowed. This error occurs when the column specified by the row.names parameter contains duplicated values. Duplicated row names are not allowed in an R data frame.
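The restriction is a property of data frames in general, not only of read.csv(). A minimal sketch, assuming a small made-up data frame (the column name height and the values are purely illustrative):

```r
# Assumed toy data: try to build a data frame with two rows that are both named "z".
result <- tryCatch(
  data.frame(height = c(1.55, 1.56), row.names = c("z", "z")),
  error = function(e) conditionMessage(e)
)

# The data frame could not be built, so `result` holds the error message
# as a character string instead of a data frame.
is.character(result)
```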
For example, consider the following dataset in CSV format:
name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63
In this dataset, one of the values in the name column (z) is duplicated. If you import this dataset using the read.csv() function with the name column specified as row.names, you will get the duplicate 'row.names' are not allowed error.
Let's replicate the error using the above dataset:
df = read.csv('https://reneshbedre.github.io/assets/posts/other/data.csv', row.names = "name")
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate 'row.names' are not allowed
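Before choosing a fix, it can help to see which values are actually duplicated. The sketch below is one possible way to check, assuming the same four rows are supplied inline through the text argument of read.csv() rather than downloaded; the variable names csv_text and dup_values are illustrative.

```r
# Same example data as above, supplied inline so the snippet is self-contained.
csv_text <- "name,height,weight
x,1.80,65
y,1.62,67
z,1.55,62
z,1.56,63"

# Import without row.names so that the import itself cannot fail.
df <- read.csv(text = csv_text)

# Values of the name column that occur more than once.
# as.character() guards against older R versions that read strings as factors.
dup_values <- unique(as.character(df$name[duplicated(df$name)]))
dup_values

# All rows that share a duplicated name.
df[df$name %in% dup_values, ]
```

If dup_values is empty, the name column is safe to use as row.names.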
There are two ways to fix the duplicated row names error, as outlined below.
Solution 1: Import without the row.names parameter
You can import the CSV file without specifying row.names, or set row.names = NULL explicitly. In both cases, R numbers the rows automatically (1, 2, 3, ...) instead of taking row names from a column.
df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv")
# same as df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv", row.names = NULL)
# view data frame
df
# output
name height weight
1 x 1.80 65
2 y 1.62 67
3 z 1.55 62
4 z 1.56 63
Now, if you still want to use the first column (name) as row names, you can make its values unique with the make.names() function, which appends a numeric suffix to repeated values when unique = TRUE.
# make values in name column unique
uniq_name = make.names(df$name, unique = TRUE)
row.names(df) = uniq_name
# view data frame
df
# output
name height weight
x x 1.80 65
y y 1.62 67
z z 1.55 62
z.1 z 1.56 63
You have now created a data frame with unique row names. If you like, you can drop the name column (df[, -1]) or keep it in the data frame.
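Note that make.names() does more than add suffixes: it also rewrites values that are not syntactically valid R names. If you only want to disambiguate repeated values and leave everything else untouched, base R's make.unique() is a possible alternative. A small sketch, assuming an illustrative vector with one extra label ("1 a") that is not a valid R name:

```r
# Illustrative labels; "1 a" is an assumed extra value that is not a valid R name.
labels <- c("x", "y", "z", "z", "1 a")

# make.unique() only appends suffixes to repeated values.
unique_only <- make.unique(labels)
unique_only

# make.names() additionally rewrites the last label into a valid R name.
valid_and_unique <- make.names(labels, unique = TRUE)
valid_and_unique
```

Either result can be assigned with row.names(df) = ..., as long as its length matches the number of rows.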
Solution 2: Create a matrix
In R, a data frame cannot have duplicated row names, but a matrix can. If you need to keep the original names unchanged, you can store the data in a matrix instead.
# load dataset
df = read.csv("https://reneshbedre.github.io/assets/posts/other/data.csv")
# view data frame
df
# output
name height weight
1 x 1.80 65
2 y 1.62 67
3 z 1.55 62
4 z 1.56 63
Create a matrix from the data frame using the data.matrix() function:
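# data.matrix() returns a numeric matrix; the character name column is
# turned into integer codes, which is why it is dropped below
df_mat = data.matrix(df)
# duplicated row names are allowed for a matrix
row.names(df_mat) = df$name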
# drop first name column
df_mat = df_mat[ ,-1]
# view matrix
df_mat
# output
height weight
x 1.80 65
y 1.62 67
z 1.55 62
z 1.56 63
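One thing to keep in mind with duplicated row names in a matrix is that indexing by name returns only the first matching row. The sketch below rebuilds the same matrix by hand (the object name m is illustrative) and shows a logical comparison as one way to retrieve every matching row.

```r
# Same values as the matrix above, built directly for a self-contained example.
m <- matrix(
  c(1.80, 1.62, 1.55, 1.56, 65, 67, 62, 63),
  ncol = 2,
  dimnames = list(c("x", "y", "z", "z"), c("height", "weight"))
)

# Indexing by name picks only the first row called "z".
first_z <- m["z", ]
first_z

# Comparing against rownames() returns all rows called "z";
# drop = FALSE keeps the result a matrix even if only one row matches.
all_z <- m[rownames(m) == "z", , drop = FALSE]
all_z
```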
This work is licensed under a Creative Commons Attribution 4.0 International License