Convert FASTQ to FASTA Format

Renesh Bedre

The FASTQ and FASTA file formats are widely used in bioinformatics data analysis.

A FASTQ file stores each nucleotide sequence together with its per-base quality scores, whereas a FASTA file stores only the sequence identifier and the nucleotide sequence.
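To make the difference between the two formats concrete, here is a minimal Python sketch that turns a single FASTQ record into its FASTA equivalent. It assumes the common four-line FASTQ layout (header, sequence, "+" separator, quality); the function name `fastq_record_to_fasta` and the short read are hypothetical and used only for illustration.

```python
def fastq_record_to_fasta(record_lines):
    """Convert one four-line FASTQ record into a two-line FASTA record."""
    header, sequence, separator, quality = record_lines
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a well-formed four-line FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # FASTA keeps the read name (with ">" instead of "@") and the bases;
    # the "+" separator and the quality string are discarded.
    return [">" + header[1:], sequence]


# Hypothetical short read, for illustration only
record = ["@read1 example", "ACGTN", "+", "IIIF#"]
fasta_lines = fastq_record_to_fasta(record)
print("\n".join(fasta_lines))
```

The conversion is lossy: once the quality line is dropped, the FASTQ file cannot be rebuilt from the FASTA output.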

Using one of these tools, you can convert a FASTQ file into a FASTA file:

seqtk

You can use the seqtk seq command to convert FASTQ to FASTA as follows:

# with compressed FASTQ
seqtk seq -a sample.fastq.gz > sample.fasta

# with uncompressed FASTQ
seqtk seq -a sample.fastq > sample.fasta
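seqtk reads both compressed and uncompressed input with the same command. If you process FASTQ files in your own Python script, you may want similar behaviour. The sketch below is one possible way to do it, assuming gzip is the only compression in use; the helper name `open_fastq` and the demo record are hypothetical.

```python
import gzip
import os
import tempfile


def open_fastq(path):
    """Open a FASTQ file as text, whether or not it is gzip-compressed."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":  # first two bytes of every gzip file
        return gzip.open(path, "rt")
    return open(path, "r")


# Demo: write the same throwaway record plain and gzip-compressed
record = "@read1\nACGT\n+\nIIII\n"
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "demo.fastq")
    gz_path = os.path.join(tmp, "demo.fastq.gz")
    with open(plain_path, "w") as out:
        out.write(record)
    with gzip.open(gz_path, "wt") as out:
        out.write(record)
    with open_fastq(plain_path) as handle:
        plain_text = handle.read()
    with open_fastq(gz_path) as handle:
        gz_text = handle.read()

print(plain_text == gz_text)
```

Checking the leading bytes rather than the .gz extension means a misnamed file is still read correctly.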

reformat (from BBTools)

You can use the reformat.sh script from BBTools to convert FASTQ to FASTA as follows:

reformat.sh in=sample.fastq out=sample.fasta

If you have paired-end reads, you can obtain FASTA files for both reads (read1 and read2) simultaneously:

reformat.sh in1=read1.fastq in2=read2.fastq out1=read1.fasta out2=read2.fasta

seqret

You can use the seqret tool (from EMBOSS) to convert FASTQ to FASTA as follows:

seqret -sequence sample.fastq -outseq sample.fasta

fastq_to_fasta

You can use the fastq_to_fasta tool (from FASTX-Toolkit) to convert FASTQ to FASTA as follows:

fastq_to_fasta -Q 33 -i sample.fastq -o sample.fasta

The -Q 33 parameter specifies that the quality scores are encoded with the Phred+33 ASCII offset (the Sanger encoding, also used by Illumina 1.8 and later).
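To illustrate what the Phred+33 offset means, the following Python sketch decodes a quality string into numeric Phred scores by subtracting 33 from each character's ASCII code. The function names `phred_scores` and `error_probability` are hypothetical; the quality characters are the first six from read SRR22309490.1 in the sample file shown later in this article.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(char) - offset for char in quality]


def error_probability(score):
    """Convert a Phred score Q into a base-call error probability 10^(-Q/10)."""
    return 10 ** (-score / 10)


quality = "AAFFFJ"  # first six quality characters of read SRR22309490.1
scores = phred_scores(quality)
print(scores)  # [32, 32, 37, 37, 37, 41]
print(error_probability(30))
```

A Phred score of 30 corresponds to an estimated error rate of 1 in 1000 base calls. None of this information is carried over to the FASTA file.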

bioinfokit

bioinfokit is a Python package that can be used for FASTQ to FASTA conversion as follows:

# import package
from bioinfokit import analys

# convert FASTQ to FASTA
analys.format.fqtofa(file="sample.fastq")

The output FASTA file will be saved as output.fasta in the same directory.

awk

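awk can be used for FASTQ to FASTA conversion as follows (this assumes that each FASTQ record spans exactly four lines):

# keep header and sequence lines; replace only the leading "@" with ">"
awk 'NR%4 == 1 {print ">" substr($0, 2)} NR%4 == 2 {print}' sample.fastq > sample.fasta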
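The awk command above selects lines by their position within each four-line record (NR%4) rather than by their first character. This matters because a quality line can itself begin with "@" (Phred score 31 in Phred+33 encoding), so matching on "@" alone could mistake it for a header. A rough Python equivalent of the positional approach, with a hypothetical function name and made-up reads, might look like this:

```python
def fastq_lines_to_fasta(lines):
    """Yield FASTA lines from an iterable of FASTQ lines (four lines per read)."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        position = index % 4  # 0 = header, 1 = sequence, 2 = "+", 3 = quality
        if position == 0:
            if not line.startswith("@"):
                raise ValueError(f"line {index + 1}: expected '@' header")
            yield ">" + line[1:]
        elif position == 1:
            yield line
        # positions 2 and 3 (separator and quality) are skipped


# Hypothetical reads; the first quality string starts with "@" (Phred 31)
fastq = [
    "@read1", "ACGT", "+", "@III",
    "@read2", "GGCA", "+", "FFFF",
]
fasta = list(fastq_lines_to_fasta(fastq))
print(fasta)
```

Like the awk version, this sketch does not handle FASTQ files in which sequences are wrapped over several lines.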

Detailed example for FASTQ to FASTA using seqtk

The following example demonstrates how to convert FASTQ to FASTA with a sample FASTQ file. You can download the sample FASTQ file using this link.

View the FASTQ file:

# view first few sequences
head sample.fastq

@SRR22309490.1 1 length=101
CTGTTTTGTCTATTTTTGTTTGGTGCATTAGCTCCAATTGTGAACGTTAATTATGGAGGAATTAGTGGTGCTTTTTATGGGAACTATAGATCTAATTATAT
+SRR22309490.1 1 length=101
AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
@SRR22309490.2 2 length=101
ACCGTATATGTTTTCTATGTTCTCCACCGCAACATACTCTCCTTGTGAGAGTTTAAAGATATTCTTCTTCCTGTCAATTATCTTCATGCTTCCATCTGGTT
+SRR22309490.2 2 length=101
<AAF<J7<<JJJJJJJJFJFF<FJFFJJJJJJJJJJJFJ-FJJFJJJJJJJJJJJFJJF<FJJJJJJJJFJJJJJJJJJJFFJJFFAJJFJFFJJ<FF-FA
@SRR22309490.3 3 length=101
CTCCACTACTATCTCTTCTTCTTTGGAATATCTCCACGGAAAATCATCTTCACAAAAGCGAGATATTCCATTATCGCACCAAAAGTGTCTATGTGAACCCA

Now, convert FASTQ to FASTA using seqtk:

# convert FASTQ to FASTA
seqtk seq -a sample.fastq > sample.fasta

# view first few sequences from FASTA
head sample.fasta

>SRR22309490.1 1 length=101
CTGTTTTGTCTATTTTTGTTTGGTGCATTAGCTCCAATTGTGAACGTTAATTATGGAGGAATTAGTGGTGCTTTTTATGGGAACTATAGATCTAATTATAT
>SRR22309490.2 2 length=101
ACCGTATATGTTTTCTATGTTCTCCACCGCAACATACTCTCCTTGTGAGAGTTTAAAGATATTCTTCTTCCTGTCAATTATCTTCATGCTTCCATCTGGTT
>SRR22309490.3 3 length=101
CTCCACTACTATCTCTTCTTCTTTGGAATATCTCCACGGAAAATCATCTTCACAAAAGCGAGATATTCCATTATCGCACCAAAAGTGTCTATGTGAACCCA
>SRR22309490.4 4 length=101
CCATGACCTTGGATACAACTTGCCTAGTGGGTCATGGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCGTATCTCGTATGCCGTCTTCTGCT
>SRR22309490.5 5 length=101
CTCGCAGTTGACTCATACTTAGCTCTATCGGTTTTGTACATGTGAGCAATCTCTGGAACCAATGGATCATCTGGGTTTGGGTCCGTTAACAATGAACATAT

Similarly, you can convert FASTQ files to FASTA files using other tools described above.
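Whichever tool you use, a quick sanity check is to confirm that the FASTA file contains as many records as the FASTQ file contains reads. The sketch below is one way to do this in Python, assuming four-line FASTQ records and no blank lines inside records; the function names and the demo files are hypothetical.

```python
import os
import tempfile


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming four lines per read."""
    with open(path) as handle:
        n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


def count_fasta_records(path):
    """Count records in a FASTA file by counting '>' header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Demo on throwaway files containing two hypothetical reads
with tempfile.TemporaryDirectory() as tmp:
    fastq_path = os.path.join(tmp, "demo.fastq")
    fasta_path = os.path.join(tmp, "demo.fasta")
    with open(fastq_path, "w") as out:
        out.write("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nFFFF\n")
    with open(fasta_path, "w") as out:
        out.write(">read1\nACGT\n>read2\nGGCA\n")
    n_fastq = count_fastq_reads(fastq_path)
    n_fasta = count_fasta_records(fasta_path)

print(n_fastq, n_fasta, n_fastq == n_fasta)
```

For real data, you would call the two functions on sample.fastq and sample.fasta instead of the demo files.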

This work is licensed under a Creative Commons Attribution 4.0 International License