As described in my previous blog post, heatmaps allow quick visualisation of various measurements between samples. Differences in magnitude are usually represented as changes in hue or colour intensity. However, including a second parameter can be challenging. Imagine a scenario where you have identified 5 Gene Ontology Biological Process (GOBP) terms that differ significantly between the infected and uninfected samples over the course of 3 days. To plot them on a graph, you can negative log-transform the adjusted p-values and then plot a heatmap as shown below:
However, if you want to also display the combined score from your EnrichR analysis, you will have to plot another heatmap:
As shown in the example above, you will need 2 figure panels to fully describe your pathway analysis. A more elegant way to display these results is to use a dot plot. In simple terms, a dot plot is a form of data visualisation that plots data points as dots on a graph. The advantage of using dots rather than the rectangles of a heatmap is that you can vary the size of the dots to add another dimension to your data visualisation. For instance, in this specific example, you can make the size of each dot proportional to the negative log p-value and use the hue of the dot to represent the enrichment score. This also means you only need one graph to fully represent the pathway analysis!
Dot plots can be easily plotted in Python, using either the Seaborn package or the Plotly Express package. I personally prefer the Plotly Express package as the syntax is simpler and you can mouse over the dots to display the exact values. To plot the dot plot, we first load the standard packages:
import csv
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
We then load a ‘test’ dataset from my desktop, in which the columns contain timepoints, pathway terms, negative logP values and combined scores. It is also a good habit to convert the timepoint to a “string” datatype so that the x-axis is treated as categorical and does not show intermediate tick values such as 1.5 and 2.5.
df = pd.read_csv('/Users/kuanrongchan/Desktop/test.csv')
# Store timepoints as strings so they are plotted as categories, not numbers
df['timepoint'] = df['timepoint'].astype("string")
df.head(5)
Output is as follows (columns, left to right: row index, timepoint, Term, adjusted p-value, Combined_score and neg_logP):
| 0 | 1 | Defense Response To Virus | 3.942000e-25 | 87.531 | 24.404283 |
| 1 | 2 | Defense Response To Virus | 3.940000e-27 | 875.310 | 26.404283 |
| 2 | 3 | Defense Response To Virus | 5.000000e-02 | 2.000 | 1.301030 |
| 3 | 1 | Defense Response To Symbiont | 2.256000e-25 | 95.555 | 24.646661 |
| 4 | 2 | Defense Response To Symbiont | 2.260000e-27 | 955.550 | 26.646661 |
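If you do not have the CSV file to hand, a stand-in table can be built directly from the five rows shown above. The sketch below is only an illustration: the column name `adj_pvalue` is hypothetical (the original header for that column is not shown), and the transform is assumed to be a base-10 logarithm, which reproduces the displayed neg_logP values to within about 0.001.

```python
import numpy as np
import pandas as pd

# Rows copied from the output above. "adj_pvalue" is a hypothetical
# column name; the real header for this column is not shown in the post.
df = pd.DataFrame({
    "timepoint": [1, 2, 3, 1, 2],
    "Term": [
        "Defense Response To Virus",
        "Defense Response To Virus",
        "Defense Response To Virus",
        "Defense Response To Symbiont",
        "Defense Response To Symbiont",
    ],
    "adj_pvalue": [3.942e-25, 3.94e-27, 5e-2, 2.256e-25, 2.26e-27],
    "Combined_score": [87.531, 875.310, 2.000, 95.555, 955.550],
})

# Negative log10 transform (assumed base 10): smaller p-values
# become larger numbers, which suits dot size or colour intensity.
df["neg_logP"] = -np.log10(df["adj_pvalue"])

# Store timepoints as strings so they are plotted as categories.
df["timepoint"] = df["timepoint"].astype("string")

print(df)
```

With the transform applied, a p-value of 0.05 maps to roughly 1.3, so any dot larger than that corresponds to a result below the usual significance threshold.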
Finally, the dot plot can be plotted using the following syntax:
fig = px.scatter(
    df,
    x="timepoint",
    y="Term",
    color="Combined_score",
    size="neg_logP",
    color_continuous_scale=px.colors.sequential.Reds,
)
fig.show()
The output is a dot plot in which the size of each dot is proportional to the -log p-value and the colour intensity reflects the combined score. You can customise the colour scale using the options listed at this website:
Because a dot plot can add another dimension to the analysis, pathway analyses are commonly presented as dot plots. However, I am sure there are other scenarios where dot plots can be used appropriately. Next time you find yourself plotting multiple heatmaps, do consider using a dot plot as an alternative data visualisation tool!