How to Use blastdbcmd to Extract Sequences from a BLAST Database

Renesh Bedre

blastdbcmd is a command-line utility from the NCBI BLAST+ toolkit that extracts sequences from a formatted BLAST database by sequence identifier.

The general syntax of blastdbcmd looks like this:

# extract specific sequences
blastdbcmd -db blast_db_name -entry seq_id -out out.fasta

# extract all sequences
blastdbcmd -db blast_db_name -entry all -out out.fasta

Where,

Parameter    Description
-db          BLAST database name
-entry       Comma-delimited sequence identifiers to extract. Use “all” to extract every sequence from the formatted BLAST database
-out         Writes the output to a file instead of printing it to the console

Note: To extract specific sequences by identifier, the BLAST database should be created with the -parse_seqids option of makeblastdb.
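As a rough illustration of the note above, the sketch below builds a small database with -parse_seqids. The file name sample.fasta is a placeholder, and -dbtype prot is an assumption chosen because the two example records are protein sequences; use nucl for nucleotide input. The makeblastdb call is skipped if the tool is not installed.

```shell
#!/usr/bin/env bash
# Sketch: build a BLAST database that keeps parsed sequence identifiers.
# sample.fasta is a placeholder file name used only for illustration.

cat > sample.fasta <<'EOF'
>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
>seq2
KQKLSKANKNPNGSNNDNNVSSSSSASGKSNCYIMKENEKLRKKAELLNQENQQLL
EOF

# -dbtype must match the input: "prot" for protein, "nucl" for nucleotide.
# -parse_seqids lets blastdbcmd look up records by identifier later.
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in sample.fasta -dbtype prot -parse_seqids -out sample
else
    echo "makeblastdb not found; install NCBI BLAST+ first" >&2
fi
```

The -out value (sample here) is the database name that is later passed to blastdbcmd with -db.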

The following examples explain how to use blastdbcmd to extract sequences from a formatted BLAST database.

Extract specific sequences from BLAST database

Extract a single sequence from the sample_nucl.fasta BLAST database

# extract single sequence
blastdbcmd -db sample -entry seq1

Output:

>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS

Extract multiple sequences from the sample_nucl.fasta BLAST database

# extract multiple sequences
blastdbcmd -db sample -entry seq1,seq2

Output:

>seq1
MERLNSKLYVENCYIMKENEKLRKKAELLNQENQQLLVQLKQKLSKANKNPNGSNNDNNVSSSSSASGKS
>seq2
KQKLSKANKNPNGSNNDNNVSSSSSASGKSNCYIMKENEKLRKKAELLNQENQQLL
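When there are many identifiers, typing the comma-delimited list by hand gets tedious. A minimal sketch, assuming a hypothetical file ids.txt with one identifier per line, joins the lines with commas and passes the result to -entry. The blastdbcmd call runs only if the tool is installed.

```shell
#!/usr/bin/env bash
# Sketch: build the comma-delimited -entry value from a list of identifiers.
# ids.txt is a hypothetical file with one sequence identifier per line.
printf 'seq1\nseq2\n' > ids.txt

# Join all lines of ids.txt into a single comma-separated string.
entry_list=$(paste -s -d, ids.txt)
echo "$entry_list"   # prints: seq1,seq2

# Run the extraction only when blastdbcmd is available.
if command -v blastdbcmd >/dev/null 2>&1; then
    blastdbcmd -db sample -entry "$entry_list" \
        || echo "extraction failed; check that the database exists" >&2
else
    echo "blastdbcmd not found; install NCBI BLAST+ first" >&2
fi
```

blastdbcmd also provides an -entry_batch option that reads identifiers from a file, which may be more convenient for very long lists.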

Extract sequences from the sample_nucl.fasta BLAST database and redirect the output to a file

# extract multiple sequences and write them to a file
blastdbcmd -db sample -entry seq1,seq2 -out out.fasta

Extract all sequences from BLAST database

Extract all sequences from the sample_nucl.fasta BLAST database and redirect the output to a file

# extract all sequences
blastdbcmd -db sample -entry all -out out.fasta

The out.fasta file should contain all sequences from the formatted BLAST database.
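One quick way to sanity-check the result is to count the FASTA header lines in the output file. The helper name count_fasta_records is made up for this sketch; it simply counts lines that start with ">".

```shell
#!/usr/bin/env bash
# Sketch: count the records in a FASTA file by counting header lines.
# count_fasta_records is a hypothetical helper, not part of BLAST+.
count_fasta_records() {
    # grep -c prints the number of lines that begin with ">"
    grep -c '^>' "$1"
}

# Report the count only if the extraction output is present.
if [ -f out.fasta ]; then
    echo "out.fasta contains $(count_fasta_records out.fasta) sequences"
fi
```

If the count differs from the number of records in the FASTA file used to build the database, the extraction is probably incomplete.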

This work is licensed under a Creative Commons Attribution 4.0 International License
