REVIGO: A useful tool to summarise long lists of Gene Ontology terms

Scatterplot produced by REVIGO, showing the relatedness between the different GO terms. Data are based on the day 2 vs day 0 analysis in subjects vaccinated with YF17D, published by Jue Hou et al., JI, 2017.

After an Enrichr analysis, you may end up with more than 30 significant Gene Ontology (GO) pathway hits. Presenting all of them in a figure panel is challenging, and can even be distracting if many of the hits belong to similar biological pathways. In this entry, I recommend REVIGO, a web server that summarises long lists of GO terms into a representative subset. It does this by clustering the terms with an algorithm that relies on semantic similarity measures. The non-redundant GO terms can then be visualised as a scatterplot, interactive graph, treemap or tag cloud to show how similar or different they are. REVIGO therefore mitigates the problem of long lists of significant GO terms that belong to highly similar pathways, allowing you to focus on the unique GO pathways that are influenced by the treatment conditions.

For instance, consider human subjects given the live-attenuated yellow fever vaccine, in whom differentially expressed genes (DEGs) are upregulated at day 2 post-vaccination (Jue Hou et al., JI, 2017). The GO pathways returned by Enrichr for these DEGs at adjusted p-value < 0.05 are as follows:

GO:0071357 cellular response to type I interferon
GO:0060337 type I interferon signaling pathway
GO:1903901 negative regulation of viral life cycle
GO:2000112 regulation of cellular macromolecule biosynthetic process
GO:1903506 regulation of nucleic acid-templated transcription
GO:0045071 negative regulation of viral genome replication
GO:0045069 regulation of viral genome replication
GO:0035455 response to interferon-alpha
GO:0010468 regulation of gene expression
GO:0019221 cytokine-mediated signaling pathway
GO:0060333 interferon-gamma-mediated signaling pathway
GO:0097696 STAT cascade
GO:0034340 response to type I interferon
GO:0006355 regulation of transcription, DNA-templated
GO:0006974 cellular response to DNA damage stimulus
GO:0006400 tRNA modification
GO:0071346 cellular response to interferon-gamma
GO:0030488 tRNA methylation
GO:0035456 response to interferon-beta
GO:0006281 DNA repair
GO:0032446 protein modification by small protein conjugation
GO:0048525 negative regulation of viral process
GO:0002230 positive regulation of defense response to virus by host
GO:0016567 protein ubiquitination
GO:0050691 regulation of defense response to virus by host
GO:0001510 RNA methylation
GO:0032543 mitochondrial translation
GO:0007259 JAK-STAT cascade
GO:0001817 regulation of cytokine production
GO:0006415 translational termination
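
To get from an Enrichr table to something REVIGO can read, the GO IDs need to be pulled out of the term names and paired with their p-values. The snippet below is a minimal sketch of that step, not an official Enrichr or REVIGO client. It assumes that Enrichr term names end with the GO ID in brackets, and that REVIGO accepts one GO ID per line, optionally followed by a value such as a p-value. The function name `to_revigo_input`, the p-values and the `GO:0000000` row are all hypothetical placeholders, not values from the study.

```python
import re

# Hypothetical rows in the style of an Enrichr results table:
# (term name, adjusted p-value). The p-values are placeholders for
# illustration only and are NOT taken from Jue Hou et al.
enrichr_rows = [
    ("cellular response to type I interferon (GO:0071357)", 1e-10),
    ("type I interferon signaling pathway (GO:0060337)", 1e-10),
    ("tRNA modification (GO:0006400)", 0.03),
    ("hypothetical non-significant term (GO:0000000)", 0.20),
]

# Assumption: the GO ID sits in brackets at the end of the term name.
GO_ID = re.compile(r"\((GO:\d{7})\)\s*$")


def to_revigo_input(rows, cutoff=0.05):
    """Return 'GO_ID<TAB>p-value' lines for terms below the cutoff."""
    lines = []
    for term, adj_p in rows:
        match = GO_ID.search(term)
        if match and adj_p < cutoff:
            lines.append(f"{match.group(1)}\t{adj_p:g}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The printed text can be pasted into the REVIGO input box.
    print(to_revigo_input(enrichr_rows))
```

Keeping the p-value next to each ID is optional, but it gives REVIGO something to rank by when it has to pick a representative among similar terms.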

The 30 GO Biological Process (GOBP) pathways listed above form a long, redundant list that is difficult to present in a scientific publication. Nor is it acceptable to simply drop significant pathways from the list without proper justification. REVIGO offers one of the better ways to resolve both issues. When all of these significant GO terms are submitted to REVIGO, the program condenses them into a shorter list of 16 non-redundant pathways:

GO:0071357 cellular response to type I interferon
GO:0035455 response to interferon-alpha
GO:0097696 STAT cascade
GO:0034340 response to type I interferon
GO:0006355 regulation of transcription, DNA-templated
GO:0006974 cellular response to DNA damage stimulus
GO:0071346 cellular response to interferon-gamma
GO:0035456 response to interferon-beta
GO:0048525 negative regulation of viral process
GO:0002230 positive regulation of defense response to virus by host
GO:0016567 protein ubiquitination
GO:0001510 RNA methylation
GO:0032543 mitochondrial translation
GO:0007259 JAK-STAT cascade
GO:0001817 regulation of cytokine production
GO:0006415 translational termination
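
It can be useful to record exactly which terms REVIGO folded away, so that the reduction is documented rather than done by hand. The following is a small sketch that compares the two lists in this post. The GO IDs are copied from the lists above, while the variable names are only illustrative.

```python
# GO IDs submitted to REVIGO (the 30 significant Enrichr hits in this post).
submitted = """
GO:0071357 GO:0060337 GO:1903901 GO:2000112 GO:1903506 GO:0045071
GO:0045069 GO:0035455 GO:0010468 GO:0019221 GO:0060333 GO:0097696
GO:0034340 GO:0006355 GO:0006974 GO:0006400 GO:0071346 GO:0030488
GO:0035456 GO:0006281 GO:0032446 GO:0048525 GO:0002230 GO:0016567
GO:0050691 GO:0001510 GO:0032543 GO:0007259 GO:0001817 GO:0006415
""".split()

# GO IDs kept by REVIGO (the 16 non-redundant terms in this post).
kept = """
GO:0071357 GO:0035455 GO:0097696 GO:0034340 GO:0006355 GO:0006974
GO:0071346 GO:0035456 GO:0048525 GO:0002230 GO:0016567 GO:0001510
GO:0032543 GO:0007259 GO:0001817 GO:0006415
""".split()

kept_set = set(kept)

# Terms that were submitted but absorbed into a representative term,
# reported in their original order.
removed = [go_id for go_id in submitted if go_id not in kept_set]

if __name__ == "__main__":
    print(f"submitted: {len(submitted)}, kept: {len(kept)}, removed: {len(removed)}")
    for go_id in removed:
        print(go_id)
```

Listing the removed terms in a supplementary table is one way to justify the shorter list in the main figure.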

Indeed, the scatterplot (see the figure at the top) shows that the shortlisted GO terms are non-overlapping, which suggests that these pathways are likely modulated by different sets of DEGs. This allows the researcher to report the reduced list of GO terms and their associated DEGs more concisely, without losing biological meaning.
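
Whether the shortlisted pathways really are driven by different DEGs can be checked directly from the gene lists that Enrichr reports for each term. Below is a minimal sketch using the Jaccard index, which is the shared genes divided by all genes across the two terms. This is a simple gene-overlap check and is not the semantic similarity measure that REVIGO itself uses. The term names and gene symbols are hypothetical placeholders, not data from the study.

```python
from itertools import combinations

# Hypothetical mapping of GO term -> DEGs annotated to it.
# Placeholder names only; replace with the gene lists from your own results.
term_genes = {
    "term_A": {"gene1", "gene2", "gene3"},
    "term_B": {"gene3", "gene4"},
    "term_C": {"gene5", "gene6"},
}


def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Pairwise overlap for every pair of terms.
overlaps = {
    (s, t): jaccard(term_genes[s], term_genes[t])
    for s, t in combinations(sorted(term_genes), 2)
}

if __name__ == "__main__":
    for (s, t), score in overlaps.items():
        print(f"{s} vs {t}: {score:.2f}")
```

Values close to 0 support the reading that the terms are modulated by distinct gene sets, while values close to 1 would suggest that two terms are still largely redundant.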

Overall, REVIGO is the tool of choice for users who want to quickly summarise a long list of GO terms and better understand the distinct pathways that are differentially modulated by various infection or treatment conditions.
