Plotting volcano plots with VolcaNoseR

Volcano plot displaying genes that were most differentially expressed at day 7 post-nadir relative to day 0 in the severe COVID-19 patients. VolcaNoseR was used for graph plotting. Source: Ong EZ et al., eBioMedicine, 2021 

As described in my previous post, analysing a biological dataset only by listing its differentially expressed genes (DEGs) does not show the magnitude or extent of the gene expression changes. In contrast, a volcano plot, which is a scatterplot of -log10(adjusted p-value) against log2(fold change), shows both how the DEGs are distributed and which DEGs are most differentially expressed. Genes with the greatest fold changes and significant p-values (p < 0.05) are also ideal targets for validation.

Constructing a volcano plot is a two-step procedure. First, the fold change is determined by taking the ratio of gene abundance in the treatment group to that in the control group, followed by a log2 transformation, which places up- and downregulation on a symmetric scale and gives a normal or near-normal distribution. Values > 0 indicate upregulated genes, whereas values < 0 indicate downregulated genes. Second, a p-value adjusted for multiple testing is used to assess whether the change in gene expression between the treatment and control groups is statistically significant. The adjusted p-value is then -log10 transformed, so that smaller (more significant) p-values plot higher on the y-axis, giving the -log10(adjusted p-value).
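As a rough illustration of the two steps above, here is a minimal Python sketch that converts one gene's values into its volcano plot coordinates. The function name `volcano_coordinates` and the example numbers are hypothetical, and the sketch assumes the adjusted p-value has already been computed by an upstream differential expression tool.

```python
import math


def volcano_coordinates(treatment, control, adj_p):
    """Return (log2 fold change, -log10 adjusted p-value) for one gene.

    Assumes `treatment` and `control` are positive abundances and that
    `adj_p` has already been corrected for multiple testing upstream.
    """
    if treatment <= 0 or control <= 0:
        raise ValueError("abundances must be positive to take a ratio and a log")
    if not 0 < adj_p <= 1:
        raise ValueError("adjusted p-value must lie in (0, 1]")

    # Step 1: ratio of treatment to control, then log2 transform (x-axis).
    log2_fc = math.log2(treatment / control)

    # Step 2: -log10 transform of the adjusted p-value (y-axis).
    neg_log10_p = -math.log10(adj_p)

    return log2_fc, neg_log10_p


# Hypothetical values for illustration only: a 4-fold increase.
x, y = volcano_coordinates(treatment=80.0, control=20.0, adj_p=0.001)
```

A gene with a 4-fold increase lands at x = 2 on the log2 scale, and a 4-fold decrease would land at x = -2, which is why up- and downregulated genes appear as mirror-image arms of the plot.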

Volcano plots can be drawn in Excel or in more specialised biostatistics tools such as GraphPad Prism. However, manually annotating the genes with the largest fold changes and smallest p-values can be laborious. R scripts using the EnhancedVolcano R package are one option. Alternatively, I recommend VolcaNoseR, which makes creating and labelling volcano plots easier. VolcaNoseR is a user-friendly, open-source web app that lets one quickly change the fold change or p-value thresholds and quickly annotate the genes with the greatest fold changes and most significant p-values. I have personally plotted volcano plots of 30,000 genes with VolcaNoseR without experiencing serious lag. One disadvantage of the tool, however, is that gene annotations may overlap if the gene names are too long.
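To make the idea of thresholds and automatic annotation concrete, here is a small Python sketch of how genes might be classified and how the top candidates for labelling might be chosen. This is not VolcaNoseR's code: the function names, the fold change threshold of 1 (on the log2 scale), the ranking by absolute log2 fold change, and the gene names are all assumptions for illustration. Only the p < 0.05 cut-off comes from the text above.

```python
def classify(log2_fc, adj_p, fc_threshold=1.0, p_threshold=0.05):
    """Label a gene as 'up', 'down' or 'unchanged' using both thresholds."""
    if adj_p < p_threshold and log2_fc >= fc_threshold:
        return "up"
    if adj_p < p_threshold and log2_fc <= -fc_threshold:
        return "down"
    return "unchanged"


def top_genes_to_label(genes, n=10, fc_threshold=1.0, p_threshold=0.05):
    """Return the names of up to n genes to annotate.

    `genes` is a list of (name, log2_fc, adj_p) tuples. Genes that pass
    both thresholds are ranked by absolute log2 fold change (an assumed
    ranking rule, chosen here for simplicity).
    """
    hits = [
        g for g in genes
        if classify(g[1], g[2], fc_threshold, p_threshold) != "unchanged"
    ]
    hits.sort(key=lambda g: abs(g[1]), reverse=True)
    return [name for name, _, _ in hits[:n]]


# Hypothetical genes for illustration only.
genes = [
    ("GENE_A", 3.2, 0.0001),   # large increase, significant
    ("GENE_B", -2.5, 0.003),   # large decrease, significant
    ("GENE_C", 0.4, 0.0002),   # significant but small change
    ("GENE_D", 4.0, 0.2),      # large change but not significant
    ("GENE_E", 1.5, 0.01),     # moderate increase, significant
]
labels = top_genes_to_label(genes, n=2)
```

Changing `fc_threshold` or `p_threshold` and re-running mirrors, in a simplified way, what adjusting the thresholds in an interactive tool does: the set of highlighted genes and the labelled genes both update.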
