Convert FASTQ to FASTA Format

Renesh Bedre

The FASTQ and FASTA file formats are widely used in bioinformatics data analysis.

A FASTQ file stores nucleotide sequences together with their per-base quality scores, whereas a FASTA file stores only the sequences.
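To make the difference concrete, here is a minimal Python sketch that turns one FASTQ record into a FASTA record. The function name `fastq_record_to_fasta` and the example read are hypothetical, and the sketch assumes the common layout of four lines per record (header, sequence, `+` separator, quality string).

```python
def fastq_record_to_fasta(record_lines):
    """Convert one 4-line FASTQ record to a 2-line FASTA record (as text)."""
    header, sequence, plus, quality = [line.rstrip("\n") for line in record_lines]
    if not header.startswith("@") or not plus.startswith("+"):
        raise ValueError("not a valid 4-line FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # FASTA keeps the header (with '>' in place of '@') and the sequence;
    # the '+' separator and the quality string are dropped.
    return f">{header[1:]}\n{sequence}\n"


# Hypothetical example record, not taken from a real dataset
record = [
    "@read1 example",
    "ACGTACGT",
    "+",
    "FFFFJJJJ",
]
print(fastq_record_to_fasta(record), end="")
```

The quality string is validated but never written out, which is why the conversion cannot be reversed.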

Using one of these tools, you can convert a FASTQ file into a FASTA file:

seqtk

You can use seqtk seq to convert FASTQ to FASTA as follows:

# with compressed FASTQ
seqtk seq -a sample.fastq.gz > sample.fasta

# with uncompressed FASTQ
seqtk seq -a sample.fastq > sample.fasta
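
In the commands above, seqtk is given a gzip-compressed file and an uncompressed file in exactly the same way. If you read FASTQ files in your own Python code, a similar convenience could be sketched as below. The helper name `open_fastq` is hypothetical, and the sketch assumes compression can be detected from the two-byte gzip magic number rather than from the file extension.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file in text mode, whether or not it is gzip-compressed."""
    # Peek at the first two bytes; gzip files start with 0x1f 0x8b.
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "r")
```

For example, `with open_fastq("sample.fastq.gz") as handle:` would then iterate over decoded text lines for either kind of file.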

reformat (from BBTools)

You can use reformat.sh from BBTools to convert FASTQ to FASTA as follows:

reformat.sh in=sample.fastq out=sample.fasta

If you have paired-end reads, you can convert both read files (read1 and read2) in a single command:

reformat.sh in1=read1.fastq in2=read2.fastq out1=read1.fasta out2=read2.fasta

seqret

You can use the seqret tool (from EMBOSS) to convert FASTQ to FASTA as follows:

seqret -sequence sample.fastq -outseq sample.fasta

fastq_to_fasta

You can use the fastq_to_fasta tool (from FASTX-Toolkit) to convert FASTQ to FASTA as follows:

fastq_to_fasta -Q 33 -i sample.fastq -o sample.fasta

The -Q 33 parameter specifies that the quality scores use the Phred+33 encoding (used by Sanger and Illumina 1.8+ data).
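
As a rough illustration of what a Phred+33 offset means, the sketch below decodes a quality string into integer scores by subtracting 33 from each character's ASCII code. The function names `phred_scores` and `error_probability` are hypothetical, and the example string is the first six quality characters of the sample read shown later in this post.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(char) - offset for char in quality]


def error_probability(score):
    """Convert a Phred score Q into the error probability 10^(-Q/10)."""
    return 10 ** (-score / 10)


print(phred_scores("AAFFFJ"))  # [32, 32, 37, 37, 37, 41]
```

A score of 20 corresponds to an error probability of 0.01, and a score of 30 to 0.001.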

bioinfokit

bioinfokit is a Python package that can be used for FASTQ to FASTA conversion as follows:

# import package
from bioinfokit import analys

# convert FASTQ to FASTA
analys.format.fqtofa(file="sample.fastq")

The output FASTA file will be saved as output.fasta in the same directory.

awk

awk can be used for FASTQ to FASTA conversion as follows, assuming each record occupies exactly four lines:

awk 'NR%4 == 1 {print ">" substr($0, 2)} NR%4 == 2 {print}' sample.fastq > sample.fasta
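
The awk one-liner keeps the first two lines of every four-line record and turns the `@` header marker into `>`. The same idea can be sketched in plain Python, which may be easier to extend. The function name `fastq_to_fasta` below is hypothetical (it is not the FASTX-Toolkit program), and the sketch assumes an uncompressed file with exactly four lines per record and no blank lines.

```python
def fastq_to_fasta(fastq_path, fasta_path):
    """Stream a 4-line-per-record FASTQ file into FASTA; return the record count."""
    count = 0
    with open(fastq_path) as fin, open(fasta_path, "w") as fout:
        for line_number, line in enumerate(fin):
            position = line_number % 4
            if position == 0:
                # Header line: replace only the leading '@' with '>'
                if not line.startswith("@"):
                    raise ValueError(f"line {line_number + 1}: expected '@' header")
                fout.write(">" + line[1:])
                count += 1
            elif position == 1:
                # Sequence line: copied unchanged
                fout.write(line)
            # Positions 2 ('+' separator) and 3 (quality string) are skipped
    return count
```

Calling `fastq_to_fasta("sample.fastq", "sample.fasta")` would write the FASTA file and return the number of reads converted. Because lines are selected by position, a quality string that happens to start with `@` is not mistaken for a header.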

Detailed example for FASTQ to FASTA using seqtk

The following example demonstrates how to convert FASTQ to FASTA with a sample FASTQ file. You can download the sample FASTQ file using this link.

View the FASTQ file:

# view first few sequences
head sample.fastq

@SRR22309490.1 1 length=101
CTGTTTTGTCTATTTTTGTTTGGTGCATTAGCTCCAATTGTGAACGTTAATTATGGAGGAATTAGTGGTGCTTTTTATGGGAACTATAGATCTAATTATAT
+SRR22309490.1 1 length=101
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@SRR22309490.2 2 length=101
ACCGTATATGTTTTCTATGTTCTCCACCGCAACATACTCTCCTTGTGAGAGTTTAAAGATATTCTTCTTCCTGTCAATTATCTTCATGCTTCCATCTGGTT
+SRR22309490.2 2 length=101
<AAF<J7<<JJJJJJJJFJFF<FJFFJJJJJJJJJJJFJ-FJJFJJJJJJJJJJJFJJF<FJJJJJJJJFJJJJJJJJJJFFJJFFAJJFJFFJJ<FF-FA
@SRR22309490.3 3 length=101
CTCCACTACTATCTCTTCTTCTTTGGAATATCTCCACGGAAAATCATCTTCACAAAAGCGAGATATTCCATTATCGCACCAAAAGTGTCTATGTGAACCCA

Now, convert FASTQ to FASTA using seqtk:

# convert FASTQ to FASTA
seqtk seq -a sample.fastq > sample.fasta

# view first few sequences from FASTA
head sample.fasta

>SRR22309490.1 1 length=101
CTGTTTTGTCTATTTTTGTTTGGTGCATTAGCTCCAATTGTGAACGTTAATTATGGAGGAATTAGTGGTGCTTTTTATGGGAACTATAGATCTAATTATAT
>SRR22309490.2 2 length=101
ACCGTATATGTTTTCTATGTTCTCCACCGCAACATACTCTCCTTGTGAGAGTTTAAAGATATTCTTCTTCCTGTCAATTATCTTCATGCTTCCATCTGGTT
>SRR22309490.3 3 length=101
CTCCACTACTATCTCTTCTTCTTTGGAATATCTCCACGGAAAATCATCTTCACAAAAGCGAGATATTCCATTATCGCACCAAAAGTGTCTATGTGAACCCA
>SRR22309490.4 4 length=101
CCATGACCTTGGATACAACTTGCCTAGTGGGTCATGGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCT
>SRR22309490.5 5 length=101
CTCGCAGTTGACTCATACTTAGCTCTATCGGTTTTGTACATGTGAGCAATCTCTGGAACCAATGGATCATCTGGGTTTGGGTCCGTTAACAATGAACATAT

Similarly, you can convert FASTQ files to FASTA files using other tools described above.


This work is licensed under a Creative Commons Attribution 4.0 International License
