MA plot to visualize gene expression data using Python


What is an MA plot?

  • An MA plot is a 2-dimensional (2D) scatter plot used for visualizing gene expression datasets
  • An MA plot visualizes gene expression changes between two conditions (e.g. normal vs. treated), with the log fold change (M) on the Y-axis and the log of the mean of the normalized expression counts of the two conditions (A) on the X-axis. Generally, genes with lower mean expression values have more variable log fold changes.
  • Genes with similar expression values in both normal and treated samples cluster around the M=0 line, i.e. genes with no substantial expression difference between treatments
  • Points away from the M=0 line indicate genes with large expression differences. For example, a gene is upregulated if its point lies above the M=0 line and downregulated if it lies below
  • An MA plot by itself does not consider statistical measures (p values or adjusted p values), so it cannot tell us which genes differ with statistical significance between normal and treated samples (use a Volcano plot if you want to identify genes with statistically significant differences)
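The two axes described above can be computed by hand for a single gene. The sketch below is only an illustration, not bioinfokit's implementation: the helper name ma_values is made up, and A is taken as the log2 of the mean count (some tools use the mean of the log2 counts instead). The first two count pairs come from the example dataset used later in this post; the last two pairs are invented to show why low-count genes scatter more.

```python
import math

def ma_values(count_a, count_b):
    """Return (M, A) for one gene from two positive normalized counts.

    M = log2(count_a / count_b)        log fold change, Y-axis
    A = log2((count_a + count_b) / 2)  log of the mean count, X-axis
    """
    m = math.log2(count_a / count_b)
    a = math.log2((count_a + count_b) / 2)
    return m, a

# value1 and value2 from the first two rows of the example dataset
m1, a1 = ma_values(8862, 32767)
m2, a2 = ma_values(1099, 117)
print(round(m1, 4), round(m2, 4))  # -1.8865 3.2316, in line with the log2FC column

# Invented counts: the same difference of 10 counts at low and high expression
m_low, _ = ma_values(15, 5)
m_high, _ = ma_values(10005, 9995)
print(round(m_low, 3), round(m_high, 4))  # 1.585 0.0014
```

A difference of 10 counts moves a low-count gene far from M=0 but barely moves a high-count gene, which is why the left side of an MA plot usually looks wider. Counts of zero would need a pseudocount before taking logs, which this sketch does not handle.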

How to create an MA plot in Python?

  • We will use bioinfokit v2.0.7 or later
  • See the bioinfokit documentation for installation instructions and usage
  • For generating the MA plot, I have used the gene expression data published in Bedre et al. 2016, which identified statistically significantly induced or downregulated genes in response to salt stress in Spartina alterniflora (Read paper). You can download the gene expression dataset used for generating the MA plot here: dataset

Note: If you have your own dataset, you should import it as a pandas DataFrame. Learn how to import data using pandas
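As a rough sketch of that import step, the snippet below reads CSV text with pandas.read_csv and checks that the expected columns are present. The CSV content is a stand-in built from the first two rows of the example dataset, and the names my_df and required are illustrative only. With a real file, you would pass its path to read_csv instead of the StringIO object.

```python
import io
import pandas as pd

# Stand-in for a file on disk, using the first two rows of the example dataset
csv_text = """GeneNames,value1,value2,log2FC,p-value
LOC_Os09g01000.1,8862,32767,-1.886539,1.25e-55
LOC_Os12g42876.1,1099,117,3.231611,1.05e-55
"""

my_df = pd.read_csv(io.StringIO(csv_text))

# Check that the columns referred to in the plotting call are present
required = ["GeneNames", "value1", "value2", "log2FC", "p-value"]
missing = [col for col in required if col not in my_df.columns]
print(missing)       # an empty list means nothing is missing
print(my_df.dtypes)  # count and log2FC columns should be numeric
```

The column names in your own file can differ. What matters is that the names passed to the lfc, ct_count, st_count and pv parameters match the columns of your DataFrame.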

from bioinfokit import analys, visuz
# load dataset as pandas dataframe
df = analys.get_data('ma').data
df.head(2)
          GeneNames  value1  value2    log2FC       p-value
0  LOC_Os09g01000.1    8862   32767 -1.886539  1.250000e-55
1  LOC_Os12g42876.1    1099     117  3.231611  1.050000e-55

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value')
# the plot will be saved in the same directory (ma.png)
# set the parameter show=True if you want to view the image instead of saving it

MA plot generated by the above code (green: upregulated genes, red: downregulated genes):

MA plots
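To see roughly what such a figure is made of, a bare-bones MA scatter can also be drawn directly with matplotlib. This is a minimal sketch and not how bioinfokit builds its plot: it leaves out the up/down coloring, the last three count pairs are made up, and the output file name ma_sketch.png is arbitrary.

```python
import math
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# (value1, value2) pairs: the first two come from df.head(2) above,
# the remaining three are made up for illustration
counts = [(8862, 32767), (1099, 117), (500, 520), (40, 35), (2000, 450)]

a_vals = [math.log2((v1 + v2) / 2) for v1, v2 in counts]  # X-axis
m_vals = [math.log2(v1 / v2) for v1, v2 in counts]        # Y-axis

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(a_vals, m_vals, color="grey", s=20)
ax.axhline(0, color="black", linewidth=0.8)  # the M=0 reference line
ax.set_xlabel("A (log2 mean count)")
ax.set_ylabel("M (log2 fold change)")
fig.savefig("ma_sketch.png", dpi=300)  # arbitrary output file name
```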

Change the background theme to ggplot:

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', theme='ggplot')

MA plot with ggplot background

Add a legend to the plot:

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', plotlegend=True)

MA plot with legend

Change the colors of the MA plot

# change colormap
visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', plotlegend=True, 
    color=('#00239CFF', 'grey', '#E10600FF'))

MA plot with different colors

Change the log fold change threshold for up and downregulated genes:

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', lfc_thr=(2,2), 
    plotlegend=True, color=('#00239CFF', 'grey', '#E10600FF'))

MA plot with different log threshold
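The effect of a fold change threshold can be sketched in plain Python. The function below is an illustration, not bioinfokit's internal logic: the name classify is hypothetical, the reading of lfc_thr as (upregulation cutoff, downregulation cutoff) with both values positive is an assumption, and p-values are ignored. The first two log2FC values come from the example dataset; the other three are made up.

```python
def classify(lfc, lfc_thr=(1, 1)):
    """Label a gene from its log2 fold change.

    lfc_thr is read here as (upregulation cutoff, downregulation cutoff),
    both given as positive numbers. This reading is an assumption.
    """
    up_cutoff, down_cutoff = lfc_thr
    if lfc >= up_cutoff:
        return "up"
    if lfc <= -down_cutoff:
        return "down"
    return "unchanged"

# First two values from the example dataset, the rest are hypothetical
lfcs = [-1.886539, 3.231611, 1.4, -2.5, 0.2]

print([classify(x) for x in lfcs])
# ['down', 'up', 'up', 'down', 'unchanged']
print([classify(x, lfc_thr=(2, 2)) for x in lfcs])
# ['unchanged', 'up', 'unchanged', 'down', 'unchanged']
```

Raising the cutoffs from 1 to 2 moves genes with moderate fold changes into the unchanged group, so fewer points are highlighted in the plot.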

Add gene labels (text style) to the points:

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', plotlegend=True, 
                        geneid="GeneNames", genenames=("LOC_Os09g01000.1", "LOC_Os01g50030.1"))
# for more parameters that control the labels, see the usage at https://github.com/reneshbedre/bioinfokit#ma-plot

MA plot with labels

Change the shape of the points

# use a star-shaped marker
# see more marker shapes at https://matplotlib.org/3.1.1/api/markers_api.html
visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', 
                        color=('#00239CFF', 'grey', '#E10600FF'), markerdot='*', plotlegend=True)

MA plot with different shape of points

Change the transparency of the points

# set the point transparency with valpha (here 0.5)
visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', 
                        color=('#00239CFF', 'grey', '#E10600FF'), markerdot='*', valpha=0.5, plotlegend=True)

MA plot with change in point transparency

Draw log fold change threshold lines

# draw the threshold lines with fclines=True
visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1',  st_count='value2', pv='p-value', 
                        color=('#00239CFF', 'grey', '#E10600FF'), fclines=True, plotlegend=True)

MA plot with threshold lines

Add a title to the plot:

# set the plot title with the title parameter
visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1',  st_count='value2', pv='p-value', 
                        color=('#00239CFF', 'grey', '#E10600FF'), fclines=True, plotlegend=True, title='MA plot')

MA plot with title

Change the X and Y axis tick ranges, and the font size and font name of the tick labels

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', color=('#00239CFF', 'grey', '#E10600FF'), 
    markerdot='*', figtype='svg', xlm=(0,16,1), ylm=(-6,6,1), axtickfontsize=10, axtickfontname='Verdana', plotlegend=True)

MA plot with change in axis tick font size

Change the legend position and labels

visuz.GeneExpression.ma(df=df, lfc='log2FC', ct_count='value1', st_count='value2', pv='p-value', color=('#00239CFF', 'grey', '#E10600FF'),
    plotlegend=True, legendpos='lower right', legendlabels=['Upregulated', 'Normal', 'Downregulated'])

MA plot with change in legend position

In addition to these parameters, the parameters for figure type (figtype), X and Y axis tick ranges (xlm, ylm), axis labels (axxlabel, axylabel), axis label font size and name (axlabelfontsize, axlabelfontname), and axis tick label font size and name (axtickfontsize, axtickfontname) can be provided.

Check detailed usage

If you have any questions, comments or recommendations, please email me at reneshbe@gmail.com


This work is licensed under a Creative Commons Attribution 4.0 International License