How to Use bedtools merge

Renesh Bedre

bedtools merge is a command-line utility that combines overlapping or adjacent intervals in a BED file into a single interval.

The general syntax of bedtools merge looks like this:

# default: combine overlapping intervals
bedtools merge -i file.bed 

# also combine nearby intervals separated by up to 500 bp
bedtools merge -i file.bed -d 500

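By default, bedtools merge combines intervals in a BED file that overlap by at least 1 bp. Because the default value of the -d parameter is 0, book-ended intervals (where one interval ends exactly where the next one starts) are merged as well.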
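To make the merging rule concrete, here is a minimal awk sketch of the same idea. It is an illustration only, not how bedtools is implemented; `merge_sketch` and `demo.bed` are hypothetical names, and the sketch assumes sorted, tab-separated, three-column input.

```shell
# Hypothetical demo input: three sorted intervals (tab-separated BED3).
printf 'Chr1\t3996\t4276\nChr1\t4200\t4700\nChr1\t5039\t5630\n' > demo.bed

# merge_sketch is a hypothetical helper, not part of bedtools.
# An interval is joined to the running interval when it is on the same
# chromosome and starts no more than d bases after the running end
# (d = 0 covers overlapping and book-ended intervals).
merge_sketch() {
  awk -v d="$1" 'BEGIN { FS = OFS = "\t" }
    NR == 1 { chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0; next }
    $1 == chrom && $2 - d <= cur_end { if ($3 > cur_end) cur_end = $3 + 0; next }
    { print chrom, cur_start, cur_end; chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0 }
    END { if (NR > 0) print chrom, cur_start, cur_end }' "$2"
}

merged=$(merge_sketch 0 demo.bed)
printf '%s\n' "$merged"
```

With a distance of 0, the first two intervals collapse into Chr1 3996 4700 and the third is printed unchanged.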

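In addition to the above parameters, bedtools merge has several other parameters, such as -s (merge only intervals on the same strand), -S (merge only intervals on a given strand), and -c with -o (summarize columns of the merged intervals).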

Learn how to install bedtools

The following examples demonstrate how to use bedtools merge for combining the intervals in a BED file.

Note: The input BED file must be sorted (by chromosome and then by start position) before using bedtools merge. The BED file can be sorted using sort -k1,1 -k2,2n file.bed > sorted.bed (the n makes the start position sort numerically rather than as text).
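The following sketch shows why the start position should be sorted numerically. The file names and coordinates are made up for illustration.

```shell
# Hypothetical unsorted input; note the starts 10000 and 9000 on Chr1.
printf 'Chr2\t500\t900\nChr1\t10000\t10500\nChr1\t9000\t9500\n' > unsorted.bed

# -k1,1 sorts by chromosome name; -k2,2n sorts the start column as a number.
# Without the n, "10000" would sort before "9000" because text is compared
# character by character.
sort -k1,1 -k2,2n unsorted.bed > sorted.bed
cat sorted.bed
```

After sorting, the Chr1 interval starting at 9000 comes before the one starting at 10000, which is the order bedtools merge expects.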

Example 1: Combine overlapping intervals

The following example shows how to use bedtools merge to combine overlapping intervals in a BED file (default behavior).

head file.bed
Chr1    3996    4276
Chr1    4200    4700
Chr1    5039    5630

# combine overlapping intervals
bedtools merge -i file.bed

# output
Chr1    3996    4700
Chr1    5039    5630

If you have multiple BED files, you can concatenate them into one file, sort the concatenated file (concatenation does not preserve the sort order), and then run bedtools merge on it.
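A minimal sketch of the multi-file workflow, assuming two hypothetical inputs named fileA.bed and fileB.bed. The bedtools step is guarded so the snippet still runs on a machine where bedtools is not installed.

```shell
# Hypothetical inputs standing in for two separate BED files.
printf 'Chr1\t4200\t4700\nChr1\t5039\t5630\n' > fileA.bed
printf 'Chr1\t3996\t4276\n' > fileB.bed

# Concatenate, then sort, because the combined records are no longer in order.
cat fileA.bed fileB.bed | sort -k1,1 -k2,2n > combined.sorted.bed

# Merge the combined file (skipped if bedtools is not on the PATH).
if command -v bedtools >/dev/null 2>&1; then
  bedtools merge -i combined.sorted.bed
fi
```

Writing the sorted records to an intermediate file keeps the example easy to inspect; the sort output could also be piped straight into bedtools merge.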

Example 2: Combine adjacent intervals

The following example shows how to use bedtools merge to combine adjacent (nearby but non-overlapping) intervals in a BED file. You need to provide the maximum distance allowed between two intervals using the -d parameter.

head file.bed
Chr1    3996    4276
Chr1    4486    4600
Chr1    5439    5630

# combine adjacent intervals that are at most 500 bp apart
bedtools merge -i file.bed -d 500

# output (the third interval starts 839 bp after the second ends, so it is not merged)
Chr1    3996    4600
Chr1    5439    5630
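To choose a sensible -d value, it can help to look at the gaps between consecutive intervals first. The awk sketch below prints those gaps for the three intervals in this example; `gaps_demo.bed` is a hypothetical file name, and the sketch assumes sorted input without nested intervals.

```shell
# The three intervals from this example (tab-separated BED3).
printf 'Chr1\t3996\t4276\nChr1\t4486\t4600\nChr1\t5439\t5630\n' > gaps_demo.bed

# For each interval after the first on a chromosome, print:
# chromosome, previous end, current start, gap in bp.
gaps=$(awk 'BEGIN { FS = OFS = "\t" }
  NR > 1 && $1 == prev_chrom { print $1, prev_end, $2, $2 - prev_end }
  { prev_chrom = $1; prev_end = $3 + 0 }' gaps_demo.bed)
printf '%s\n' "$gaps"
```

The gaps are 210 bp and 839 bp. A gap is bridged only when it is no larger than the -d value, so -d 500 joins the first two intervals but leaves the third separate.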

Example 3: Combine overlapping intervals from specific strands

bedtools merge with the -S parameter merges only the intervals on the specified strand (+ or -) in a BED file. The BED file should have strand information in the sixth column.

head file.bed
Chr1    1039    2630    name3   3       +
Chr1    4200    4700    name2   2       +
Chr1    4996    5276    name1   1       +
Chr1    7000    7500    name4   4       -
Chr1    7200    7800    name5   5       -

# combine overlapping intervals from + strand
bedtools merge -i file.bed -S +

# output (the + strand intervals do not overlap, so none of them are merged)
Chr1    1039    2630
Chr1    4200    4700
Chr1    4996    5276
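The minus-strand records in the same file do overlap. The awk sketch below emulates what merging only the - strand would look like, under the assumption that -S - behaves like filtering to minus-strand records and then merging them. It is an illustration rather than bedtools itself, and `stranded.bed` is a hypothetical file name.

```shell
# The six-column example data from above (tab-separated).
printf 'Chr1\t1039\t2630\tname3\t3\t+\n' > stranded.bed
printf 'Chr1\t4200\t4700\tname2\t2\t+\n' >> stranded.bed
printf 'Chr1\t4996\t5276\tname1\t1\t+\n' >> stranded.bed
printf 'Chr1\t7000\t7500\tname4\t4\t-\n' >> stranded.bed
printf 'Chr1\t7200\t7800\tname5\t5\t-\n' >> stranded.bed

# Keep only minus-strand records, then join overlapping or book-ended ones.
minus_merged=$(awk 'BEGIN { FS = OFS = "\t" }
  $6 != "-" { next }
  !seen { chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0; seen = 1; next }
  $1 == chrom && $2 <= cur_end { if ($3 > cur_end) cur_end = $3 + 0; next }
  { print chrom, cur_start, cur_end; chrom = $1; cur_start = $2 + 0; cur_end = $3 + 0 }
  END { if (seen) print chrom, cur_start, cur_end }' stranded.bed)
printf '%s\n' "$minus_merged"
```

The two minus-strand intervals (7000 to 7500 and 7200 to 7800) collapse into a single interval, Chr1 7000 7800. The corresponding bedtools command would be bedtools merge -i file.bed -S -.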


This work is licensed under a Creative Commons Attribution 4.0 International License
