Heatmap in Python

Renesh Bedre

What is a heatmap?

  • A heatmap displays a matrix of values as a grid of colored cells, using a continuous colormap in which each color represents a specific range of values
  • It is a great way to visualize and identify patterns in statistically significant gene expression changes among hundreds to thousands of genes across different treatment conditions

How to create a heatmap using Python?

  • We will use bioinfokit v0.6 or later
  • Check the bioinfokit documentation for installation and usage
  • To generate the heatmap, I have used gene expression data published in Bedre et al. 2015, which identified gene expression changes (induced or downregulated) in response to fungal stress in cotton (Read paper). You can download the gene expression dataset used for plotting the heatmap here: dataset

Note: If you have your own dataset, you should import it as a pandas DataFrame. Learn how to import data using pandas
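As a minimal sketch of that import step, assume your expression values are stored in a comma-separated file whose first column holds the gene names. The inline text below stands in for a hypothetical file on disk, and the name df_own is only illustrative; the two rows are copied from the example dataset in this post.

```python
import io
import pandas as pd

# Stand-in for a hypothetical CSV file; in practice pass a file path to read_csv
csv_text = """Gene,A,B,C,D,E,F
B-CHI,4.5057,3.26036,-1.2494,8.89807,8.05955,-0.842803
CTL2,3.50856,1.66079,-1.85668,-2.57336,-1.3737,1.196
"""

# Read the table and use the gene names as the row index
df_own = pd.read_csv(io.StringIO(csv_text), index_col="Gene")

print(df_own.shape)  # (2, 6): two genes, six treatment conditions
```

Setting index_col while reading has the same effect as calling set_index on the first column afterwards.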

Now plot a heatmap with hierarchical clustering using bioinfokit:

from bioinfokit import analys, visuz
# load dataset as pandas dataframe
df = analys.get_data('hmap').data
df.head(2)
# output
#     Gene         A         B         C        D        E         F
# 0  B-CHI  4.505700  3.260360 -1.249400  8.89807  8.05955 -0.842803
# 1   CTL2  3.508560  1.660790 -1.856680 -2.57336 -1.37370  1.196000

# set gene names as index
df = df.set_index(df.columns[0])
df.head(2)
# output
#               A         B         C        D        E         F
# B-CHI  4.505700  3.260360 -1.249400  8.89807  8.05955 -0.842803
# CTL2   3.508560  1.660790 -1.856680 -2.57336 -1.37370  1.196000

# heatmap with hierarchical clustering 
visuz.gene_exp.hmap(df=df, dim=(3, 6), tickfont=(6, 4))

# heatmap without hierarchical clustering 
visuz.gene_exp.hmap(df=df, rowclus=False, colclus=False, dim=(3, 6), tickfont=(6, 4))
# the heatmaps are saved in the same directory
# set show=True if you want to view the image instead of saving it

Heatmaps generated by the code above, with and without hierarchical clustering:

The X-axis represents the treatment conditions and the Y-axis represents the gene names. For simplicity, I have renamed the six treatment conditions A to F. You can read the paper for a detailed understanding of the dataset.
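To give a feel for what hierarchical clustering does to the row order, here is a small sketch using SciPy on made-up values. The linkage method (average) and distance metric (Euclidean) are assumptions chosen for illustration; this post does not state which settings bioinfokit uses internally.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

# Toy expression matrix (made-up values): 4 genes x 3 conditions
genes = ["g1", "g2", "g3", "g4"]
values = np.array([
    [1.0, 1.0, 1.0],     # g1: low expression
    [10.0, 10.0, 10.0],  # g2: high expression
    [1.1, 0.9, 1.0],     # g3: similar to g1
    [9.8, 10.1, 10.0],   # g4: similar to g2
])

# Build the row dendrogram (method and metric are assumptions)
tree = linkage(values, method="average", metric="euclidean")

# The dendrogram's leaf order is the row order a clustered heatmap would draw
row_order = [genes[i] for i in leaves_list(tree)]
print(row_order)
```

Genes with similar profiles (g1 with g3, g2 with g4) end up next to each other, which is why clustered heatmaps show blocks of similar color.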

Now plot heatmaps with a different colormap:

# colormaps are available at https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html
# the default is seismic
# here I use red-yellow-green: RdYlGn
visuz.gene_exp.hmap(df=df, cmap='RdYlGn', dim=(3, 6), tickfont=(6, 4))

# heatmap without hierarchical clustering 
visuz.gene_exp.hmap(df=df, rowclus=False, colclus=False, cmap='RdYlGn', dim=(3, 6), tickfont=(6, 4))

Heatmaps generated with the red-yellow-green colormap:
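To see how a colormap turns values into colors, the sketch below queries matplotlib's RdYlGn colormap directly. The value range of -3 to 3 is an assumption for illustration, and to_rgba is a hypothetical helper name; in a real heatmap the range comes from the data.

```python
from matplotlib import pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap("RdYlGn")
norm = Normalize(vmin=-3.0, vmax=3.0)  # assumed data range, for illustration only

def to_rgba(value):
    """Scale a value into [0, 1], then look up its RGBA color in the colormap."""
    return cmap(norm(value))

low = to_rgba(-3.0)   # bottom of the range: reddish
high = to_rgba(3.0)   # top of the range: greenish
print(low, high)
```

With RdYlGn, low values are drawn in red and high values in green, with yellow in between.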

Now plot heatmaps with standardized values:

# Z-scores standardize values to mean 0 and variance 1
# zscore defaults to None and applies only to heatmaps with clustering
# here I standardize columns with Z-scores
visuz.gene_exp.hmap(df=df, zscore=1, dim=(3, 6), tickfont=(6, 4))

# here I standardize rows with Z-scores
visuz.gene_exp.hmap(df=df, zscore=0, dim=(3, 6), tickfont=(6, 4))

Heatmaps generated with Z-score standardized columns and rows:
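The standardization itself can be sketched in plain pandas. The values below are made up, and the use of the sample standard deviation (the pandas default, ddof=1) is an assumption; the exact formula bioinfokit applies is not stated in this post.

```python
import pandas as pd

# Toy expression matrix (made-up values): 3 genes x 4 conditions
df_demo = pd.DataFrame(
    {"A": [1.0, 2.0, 5.0], "B": [2.0, 4.0, 1.0],
     "C": [3.0, 6.0, 0.0], "D": [4.0, 8.0, 2.0]},
    index=["g1", "g2", "g3"],
)

# Row-wise Z-score: subtract each gene's mean, divide by each gene's std
row_z = df_demo.sub(df_demo.mean(axis=1), axis=0).div(df_demo.std(axis=1), axis=0)

# Column-wise Z-score: the same operation applied to each condition
col_z = (df_demo - df_demo.mean()) / df_demo.std()

print(row_z.round(3))
print(col_z.round(3))
```

In this toy data g2 is exactly twice g1, so both genes get identical row Z-scores: row standardization removes differences in scale and leaves only the shape of each gene's profile.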

In addition to these features, we can also control the label fontsize, figure size, resolution, figure format, and scale of the heatmaps.

Check detailed usage


References

  • Michael Waskom, Olga Botvinnik, Joel Ostblom, Saulius Lukauskas, Paul Hobson, MaozGelbart, … Constantine Evans. (2020, January 24). mwaskom/seaborn: v0.10.0 (January 2020) (Version v0.10.0). Zenodo. http://doi.org/10.5281/zenodo.3629446
  • Bedre R, Rajasekaran K, Mangu VR, Timm LE, Bhatnagar D, Baisakh N. Genome-wide transcriptome analysis of cotton (Gossypium hirsutum L.) identifies candidate gene signatures in response to aflatoxin producing fungus Aspergillus flavus. PLoS One. 2015;10(9).

This work is licensed under a Creative Commons Attribution 4.0 International License