Convert Multi-line FASTA into Single-line FASTA

Renesh Bedre

Most FASTA files obtained from biological databases contain sequences in multi-line format, but some bioinformatics tools and scripts require single-line FASTA files.

You can use the multi_to_single_line() function from the Python bioinfokit package (v2.1.3) to convert a multi-line FASTA file into a single-line FASTA file.

The general syntax looks like this:

# load package
from bioinfokit.analys import Fasta

# convert multi line FASTA into single line FASTA
Fasta.multi_to_single_line(file="eg.fasta")

The function writes an output file (output.fasta) to the same directory, with each sequence on a single line.
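To make the transformation concrete, here is a minimal sketch in plain Python of what such a conversion does. It is not bioinfokit's implementation; the function name fasta_to_single_line and its two parameters are hypothetical and chosen only for illustration. It assumes every record starts with a ">" header line.

```python
def fasta_to_single_line(in_path, out_path):
    """Write each FASTA record as one header line plus one sequence line.

    Hypothetical helper for illustration; not part of bioinfokit.
    """
    with open(in_path) as src, open(out_path, "w") as dst:
        header = None      # header of the record currently being read
        seq_parts = []     # sequence lines collected for that record

        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # a new header closes the previous record, if there was one
                if header is not None:
                    dst.write(header + "\n" + "".join(seq_parts) + "\n")
                header = line
                seq_parts = []
            else:
                seq_parts.append(line)

        # write the last record in the file
        if header is not None:
            dst.write(header + "\n" + "".join(seq_parts) + "\n")
```

Under these assumptions, calling fasta_to_single_line("eg.fasta", "eg_single.fasta") would join the wrapped sequence lines of each record into one line. Any sequence lines that appear before the first header are dropped.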

The following example shows how to use the multi_to_single_line() function.

Suppose you have the following multi-line FASTA file:

head eg.fasta

>seq
GAATGAGATTATTCTCATAGCGAAGCTTCAACATCGGAATCTTGTGAGATTACTTGGATGTTGCTTCGAG
GGAGAAGAGAAAATGCTTGTTTATGAGTATATGCCTAACAAGAGCTTGGATTTCTTCCTCTTTGATGAAA

Now convert it to single-line FASTA using the multi_to_single_line() function:

# load package
from bioinfokit.analys import Fasta

# convert multi line FASTA into single line FASTA
Fasta.multi_to_single_line(file="eg.fasta")

The single-line FASTA file (output.fasta) will be saved in the same directory.

head output.fasta

>seq
GAATGAGATTATTCTCATAGCGAAGCTTCAACATCGGAATCTTGTGAGATTACTTGGATGTTGCTTCGAGGGAGAAGAGAAAATGCTTGTTTATGAGTATATGCCTAACAAGAGCTTGGATTTCTTCCTCTTTGATGAAA
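For larger files, where inspecting the output with head is not enough, a small check can confirm that no record spans more than one sequence line. The helper below is a hypothetical sketch in plain Python, not part of bioinfokit; the name is_single_line_fasta is an assumption made for this example.

```python
def is_single_line_fasta(path):
    """Return True if every record has at most one sequence line.

    Hypothetical helper for illustration; not part of bioinfokit.
    """
    seq_lines = 0  # sequence lines seen since the last header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                seq_lines = 0  # a new record starts
            else:
                seq_lines += 1
                if seq_lines > 1:
                    return False  # sequence is wrapped over several lines
    return True
```

With the files from this example, is_single_line_fasta("eg.fasta") should return False and is_single_line_fasta("output.fasta") should return True.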


This work is licensed under a Creative Commons Attribution 4.0 International License
