Posted in Data visualisation, gProfiler

gProfiler: A useful webtool to query multiple lists of differentially expressed genes

Manhattan plot showing the enriched pathways modulated by AS01B after day 1 post-vaccination (first dose) with the Hepatitis B inactivated vaccine by gProfiler. Different adjuvants, AS01B, AS01E, AS03, AS04 and Alum were tested. No DEGs detected in AS04 and Alum. Data based on Laurane et al., 2020, Science Translational Medicine.

Comparing multiple variables with microarray or RNAseq often generates different sets of differentially expressed genes (DEGs). To examine the similarities and differences in molecular pathways modulated by these sets of DEGs, tools such as Enrichr or DAVID can be used to investigate the individual pathways involved. A Venn diagram can then be used to extract the intersecting or unique pathways involved.

While such comparisons are easy to perform if you only have a few comparisons to make, this method of analysis can be time-consuming when the query size becomes larger. Therefore, in scenarios where multiple comparison of DEGs are required, I recommend using gProfiler for analysis of enriched pathways. One major strength of gProfiler is to be able to submit multiple query lists, by simply adding a “>” symbol between your gene query lists.

To provide more context, consider that you are comparing the day 1 responses between the different adjuvants, published by Laurane et al. After determining the DEGs for the different conditions based on a defined cutoff (Fold change > 1.3, adjusted p-value < 0.05 in this case), we can input the DEGs into gProfiler for the respective conditions. Once the “Run as multiquery” box is checked, gProfiler generates manhattan plots for all the different conditions, showing the enriched pathways that are detected across various databases such as GOMF, GOBP, GOCC, KEGG, Reactome, as well as their respective p-values (See top figure for the output). The corresponding p-values for the other conditions are reflected on the y-axis.

The p-values for all enriched pathways can also be analysed in a table form (see below). Indeed, consistent with the publication findings, AS01B and AS01E behave more similarly as compared to AS03, although most of the pathways among all three adjuvants are overlapping.

Pathway analysis with Reactome using gProfiler comparing pathways modulated by AS01B, AS01E and AS03 after 1 post-vaccination

Besides gene queries against the default databases, you can also use gProfiler to query against any other custom databases, by uploading the custom GMT file. In addition, there are other functions that gProfiler can perform. For instance, g:Convert can convert gene IDs into gene symbols. g:SNPense can also provide information of the SNP queries.

Overall, gProfiler is a useful webtool that allows quick assessment of enriched pathways across multiple variables. A limitation, however, is that the genes involved in the pathways are not shown. To circumvent this, the results from gProfiler can be complemented with Enrichr to identify the gene transcripts responsible for the enriched pathways.