Posted in Resource

Systems vaccinology publications compiled


This year, I have been very interested in systems vaccinology and have put a lot of effort into summarising systems vaccinology papers that I found interesting. However, the current layout of my blog does not present these publications in a format that lets users access them quickly. I have thus compiled them systematically in my Medium publication account and on my Medium website. Feel free to visit these small compilations of my blog entries from the hyperlink. Cheers!

Posted in python

Python workflow for omics data analysis

Bioinformatic Python codes for volcano plot, DEG and heatmap analysis

Thank you all for the great interest in my blog. For the year end, I have decided to compile the blog posts that I have made this year so we can revise what we have learnt before starting the next year with new content. If you have followed my posts, a lot of emphasis has been placed on the analysis of omics data, including volcano plots, DEG identification and heatmap analysis. I have thus rewritten some of these posts on Medium, curating the code so that readers can also implement it easily. Furthermore, a concrete example dataset is provided on GitHub to bring context to the code used!

For Python codes on volcano plots, please click here.

For Python codes on DEGs, please click here.

For Python codes on clustergrams and heatmaps, please click here.

All of the above visualisation tools are important before doing any downstream analysis, including pathway enrichment analysis, interaction maps, gene network etc., which I will be covering more next year!

Posted in Data visualisation, python

Using Seaborn library to plot clustergrams

Unsupervised hierarchical clustering can also allow direct visualisation of clusters without the need for dimensionality reduction. The full article can be found here.

There are two common strategies for data normalisation. One way is to plot the heatmap or clustergram based on log2 fold-change instead of absolute abundance. The comparisons can be performed against time-point = 0 (baseline) for temporal studies, or against a control/placebo experiment (for static studies). However, the disadvantage of this method is that the distribution of baseline or control values cannot be easily visualised.

Another alternative is to perform a Z-score transformation and plot the Z-scores in the heatmap. The Z-score transformation rescales the values of each gene or protein across all conditions to mean = 0 and standard deviation = 1, enabling users to easily compare expression values across multiple genes and proteins. Values above 0 mean that the gene expression level is above the mean expression, whereas values below 0 mean that it is below the mean expression. However, the limitation is that Z-scores can only be applied to data obtained within a single experiment. Hence, the choice of either method for data normalisation is highly dependent on the research question and objectives.
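As a rough illustration of the Z-score transformation described above, here is a minimal sketch using pandas. The table `expr` and its values are made up purely for demonstration and are not taken from the dataset used later in this post.

```python
import pandas as pd

# Toy expression table (hypothetical values): rows are genes, columns are conditions.
expr = pd.DataFrame(
    {"ctrl_1": [10.0, 200.0, 5.0],
     "ctrl_2": [12.0, 220.0, 5.0],
     "treat_1": [20.0, 180.0, 9.0],
     "treat_2": [22.0, 160.0, 9.0]},
    index=["geneA", "geneB", "geneC"],
)

# Row-wise Z-score: subtract each gene's mean and divide by its standard deviation.
# ddof=0 uses the population standard deviation; ddof=1 would be the sample version.
row_mean = expr.mean(axis=1)
row_std = expr.std(axis=1, ddof=0)
zscores = expr.sub(row_mean, axis=0).div(row_std, axis=0)

print(zscores.round(2))
```

After this step every gene has mean 0 and standard deviation 1 across the conditions, so genes with very different absolute abundance share one colour scale. Seaborn's clustermap also accepts a z_score argument (z_score=0 standardises rows), which may be more convenient than computing the values by hand.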

Before jumping straight into plotting heatmaps or clustergrams for your data, it is generally a good idea to first filter the dataset, especially when you have a large number of genes. In most scenarios, the large majority of the genes or proteins remain unchanged, which reduces the ability of unsupervised hierarchical clustering to separate gene expression profiles based on your experimental conditions. To circumvent this limitation, you can filter the dataset based on differentially expressed genes or enriched biological pathways before plotting the heatmaps or clustergrams.

To plot heatmaps and clustergrams using Python, we first load the required packages. In this blog entry, we will be using the Seaborn library to plot the heatmaps and clustergrams:

import numpy as np
import pandas as pd
import seaborn as sns

Similar to my previous blog entries, we will use the transcriptomics dataset published by Zak et al., PNAS, 2012, examining how seropositive and seronegative subjects respond to the Ad5 vaccine across various time points. The summary of the study can be found here, and the processed dataset, which was analysed with Partek Genomics Suite, can be found in GitHub. The fold change, ratio, p-value and adjusted p-values (q-value) are calculated with respect to baseline (timepoint = 0).

We will load and inspect the processed dataframe from GitHub. It is important to label the gene column as the index column so that gene names can be referred to in the clustergram or heatmap. The commands are as follows:

df = pd.read_csv('',index_col=0)

The output file shows the values of the p-value (pval), adjusted p-values (qval), ratio, and fold change (fc) for 6 hours, 1-day, 3-day and 7-day time points compared to baseline (timepoint = 0):

As described above, it is important to normalise the dataset to ensure that the relative expression is comparable between different genes. Here, we will use log2 fold-change for normalisation and create log2 fold-change (log2FC) columns in the dataframe:

df['log2FC_6h'] = np.log2(df['ratio_6h'])
df['log2FC_1d'] = np.log2(df['ratio_1d'])
df['log2FC_3d'] = np.log2(df['ratio_3d'])
df['log2FC_7d'] = np.log2(df['ratio_7d'])

There are a total of 17,623 genes that are measured. To visualise the comparisons between time-points better, we will filter the dataset. Since we have previously ascertained that day 1 has the most differentially expressed genes (DEGs), we could filter the dataset based on upregulated DEGs (with fold-change > 1.5, adjusted p-value < 0.05). The filtered dataframe is saved under DEGs_up_1d. Since we are interested in plotting the log2-fold change values, we will select the log2FC columns and remove all other columns. The code is as follows:

DEGs_up_1d = df[(df['fc_1d'] > 1.5) & (df['qval_1d'] < 0.05)]
DEGs_up_1d = DEGs_up_1d.filter(items=['log2FC_6h','log2FC_1d', 'log2FC_3d', 'log2FC_7d'])
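The filter above relies on a signed fold-change column (fc), where downregulated genes carry negative values, alongside the unsigned ratio column used for log2FC. The sketch below shows how the three quantities could relate, assuming the common convention that fc equals the ratio when the ratio is at least 1 and -1/ratio otherwise. This convention is an assumption on my part and should be checked against how the dataset was exported; the helper `signed_fold_change` and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd

def signed_fold_change(ratio):
    """Convert expression ratios to signed fold-changes.

    Assumed convention (not confirmed by the source): ratios >= 1 are kept as-is,
    ratios < 1 become -1/ratio, so a halving (0.5) is reported as -2.
    """
    ratio = np.asarray(ratio, dtype=float)
    return np.where(ratio >= 1, ratio, -1.0 / ratio)

# Hypothetical ratios relative to baseline.
toy = pd.DataFrame({"ratio_1d": [4.0, 1.0, 0.5, 0.25]},
                   index=["up_gene", "flat_gene", "down_gene", "strong_down"])
toy["fc_1d"] = signed_fold_change(toy["ratio_1d"])
toy["log2FC_1d"] = np.log2(toy["ratio_1d"])

print(toy)
# A cutoff of |fc| > 1.5 corresponds to |log2FC| > log2(1.5), roughly 0.585.
print(round(float(np.log2(1.5)), 3))
```

Under this convention, log2FC is symmetric around 0, which is why it is the more natural quantity to show on a diverging colour scale.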

To plot the clustergram, the codes are as follows:

# Cluster rows and columns by average linkage; fix the colour scale to [-2, 2] so that 0 is white
g = sns.clustermap(DEGs_up_1d, cmap='vlag', method='average', vmin=-2, vmax=2, yticklabels=False)

There are a few Seaborn settings displayed in the code, and the documentation can be found here. The ‘vlag’ colour map was chosen as it gives a heatmap where increased expression is red and reduced expression is blue. Note that I have also assigned maximum and minimum values of 2 and -2 respectively, as I wanted to ensure that log2FC = 0 is white (to signify no change). yticklabels = False was chosen because it is nearly impossible to see all 826 gene names in the clustergram. The output is as shown:

As expected, the day 1 signatures are the most distinct compared to other time-points. The 6 hour and day 7 signatures are clustered together, showing that these time-points have little or no changes in gene expression profile. Interestingly, some of the DEGs have prolonged expression up to day 3, while others resolve very quickly. Do we see the same trends for the downregulated DEGs? Let’s test it out with the following command:

DEGs_down_1d = df[(df['fc_1d'] < -1.5) & (df['qval_1d'] < 0.05)]
DEGs_down_1d = DEGs_down_1d.filter(items=['log2FC_6h','log2FC_1d', 'log2FC_3d', 'log2FC_7d'])
g = sns.clustermap(DEGs_down_1d, cmap='vlag', method='average', vmin=-2, vmax=2, yticklabels=False)

Similar patterns can be seen. However, unlike upregulated DEGs where some DEGs persisted to day 3, most of the downregulated DEGs returned to baseline levels at day 3.

It’s not so hard, is it? 🙂

Posted in Data visualisation, IPyWidgets, python

Identifying differentially expressed genes with ipywidgets

Gene expression (or transcriptomics) profiling is the most common type of omics data. The identification of differentially expressed genes (DEGs) from transcriptomics data is critical to understanding the molecular driving forces or identifying the molecular biomarkers behind biological phenotypes. The full article on DEGs can be found here.

Since we need to find out the most appropriate cut-off to identify DEGs, interactive Python codes that allow users to manipulate thresholds will be the most suitable to define the selection criteria for DEGs. In this blog entry, we will use IPython widgets (ipywidgets), which can generate sliders that allow users to set cutoffs or thresholds interactively. The output based on the selected cutoffs will then be generated in real-time for users to evaluate their suitability.

As a proof of concept, we will use the transcriptomics dataset published by Zak et al., PNAS, 2012, examining how seropositive and seronegative subjects respond to the Ad5 vaccine across various time points.

The summary of the study can be found here, and the processed dataset, which was analysed with Partek Genomics Suite, can be found in GitHub. The fold change, ratio, p-value and adjusted p-values (q-value) are calculated with respect to baseline (timepoint = 0).

To install ipywidgets, simply type in the below command (you only need to do this once):

pip install ipywidgets

We can execute the ipywidgets within JupyterLab or Jupyter Notebooks. The instructions for downloading them can be found here. Next, we import the packages required for analysis:

import numpy as np
import pandas as pd
import ipywidgets as widgets
import plotly.express as px
import plotly.graph_objects as go

Next, we will load and inspect the processed dataframe from GitHub. It is also important to label the gene column as the index column so that we can refer to the column for DEG counts. Commands are as follows:

df = pd.read_csv('',index_col=0)

The output file shows the values of the p-value (pval), adjusted p-values (qval), ratio, and fold change (fc) for 6 hours, 1-day, 3-day and 7-day time points compared to baseline (timepoint = 0):

Next, we will execute the ipywidgets command:

from ipywidgets import interact

@interact
def show_DEGs(column1=['fc_6h','fc_1d','fc_3d', 'fc_7d'], x=(1,3,0.1), column2=['qval_6h','qval_1d','qval_3d','qval_7d']):
    # Count genes whose absolute fold-change exceeds x and whose q-value is below 0.05
    print('Total DEGs:', len(df[(abs(df[column1]) > x) & (df[column2] < 0.05)].index))

The above code may seem intimidating, but you can break this down into bite sizes.

First, we import the tools from ipywidgets required to build the interactive features. The @interact decorator automatically creates a dropdown menu for each list of column names and a slider for the numeric range.

In this case, we created 2 arguments (column1 and column2) to select the fold-change and adjusted p-value (q-value) columns respectively. The argument x=(1,3,0.1) means that the fold-change cutoff x can be varied from 1 to 3 in steps of 0.1. Finally, we print the total number of DEGs (both up- and downregulated) with an absolute fold-change greater than x and a q-value < 0.05.
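The counting logic inside the widget can also be checked outside a notebook. Below is a minimal sketch on a made-up dataframe; the helper name `count_degs`, its q_cutoff parameter and the toy values are my own assumptions and are not part of the original code.

```python
import pandas as pd

def count_degs(df, fc_col, q_col, fc_cutoff=1.5, q_cutoff=0.05):
    """Count genes with |fold-change| above fc_cutoff and q-value below q_cutoff."""
    mask = (df[fc_col].abs() > fc_cutoff) & (df[q_col] < q_cutoff)
    return int(mask.sum())

# Hypothetical toy data using the same column naming as the post.
toy = pd.DataFrame(
    {"fc_1d": [2.5, -1.8, 1.2, 3.0, -2.2],
     "qval_1d": [0.01, 0.03, 0.001, 0.20, 0.04]},
    index=["g1", "g2", "g3", "g4", "g5"],
)

print(count_degs(toy, "fc_1d", "qval_1d", fc_cutoff=1.5))  # 3 genes pass
print(count_degs(toy, "fc_1d", "qval_1d", fc_cutoff=2.0))  # 2 genes pass
```

Keeping the counting in a plain function like this makes it easy to test on a few rows before wiring it to sliders.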

The output is as follows:

I have taken multiple snapshots so that we can appreciate the number of DEGs across different days. In this case, the total number of DEGs for each time-point with fold-change > 2 and adjusted p-values < 0.05 is shown. Consistent with the volcano plot, we see that the greatest number of DEGs is at day 1.

We can change the x value to 1.5 to visualise the number of DEGs with fold-change value > 1.5, adjusted p-values < 0.05. The output is as follows:

As expected, we see more DEGs as the fold-change cut-off is less stringent. Note that the number of DEGs in day 1 increased by approximately 3-fold when we reduced the fold-change value from 2 to 1.5, which is quite a big difference! It is thus more reasonable to use a fold-change of 1.5, in this case, to avoid losing too much data.

After we decide that a fold-change value of 1.5 and q-value < 0.05 is most appropriate for our analysis, we will plot these details in a stacked bar chart. We first calculate the number of upregulated DEGs (fold-change > 1.5, adjusted p-value < 0.05) and the number of downregulated DEGs (fold-change < -1.5, adjusted p-value < 0.05) for each time-point. The commands are as shown:

DEGs_up_6h = len((df[(df['fc_6h'] > 1.5) & (df['qval_6h'] < 0.05)]).index)
DEGs_down_6h = len((df[(df['fc_6h'] < -1.5) & (df['qval_6h'] < 0.05)]).index)
DEGs_up_1d = len((df[(df['fc_1d'] > 1.5) & (df['qval_1d'] < 0.05)]).index)
DEGs_down_1d = len((df[(df['fc_1d'] < -1.5) & (df['qval_1d'] < 0.05)]).index)
DEGs_up_3d = len((df[(df['fc_3d'] > 1.5) & (df['qval_3d'] < 0.05)]).index)
DEGs_down_3d = len((df[(df['fc_3d'] < -1.5) & (df['qval_3d'] < 0.05)]).index)
DEGs_up_7d = len((df[(df['fc_7d'] > 1.5) & (df['qval_7d'] < 0.05)]).index)
DEGs_down_7d = len((df[(df['fc_7d'] < -1.5) & (df['qval_7d'] < 0.05)]).index)
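The eight near-identical lines above could also be written as a loop over time-points. This is only a sketch under the assumption that the columns follow the fc_<timepoint> and qval_<timepoint> naming used in this post; the helper `summarise_degs` and the toy data are hypothetical.

```python
import pandas as pd

def summarise_degs(df, timepoints, fc_cutoff=1.5, q_cutoff=0.05):
    """Return a table of up- and downregulated DEG counts per time-point."""
    rows = {}
    for tp in timepoints:
        significant = df[f"qval_{tp}"] < q_cutoff
        rows[tp] = {
            "up": int(((df[f"fc_{tp}"] > fc_cutoff) & significant).sum()),
            "down": int(((df[f"fc_{tp}"] < -fc_cutoff) & significant).sum()),
        }
    return pd.DataFrame.from_dict(rows, orient="index")

# Hypothetical toy data with two time-points.
toy = pd.DataFrame({
    "fc_6h": [1.1, -1.2, 1.6, 1.0], "qval_6h": [0.50, 0.40, 0.04, 0.90],
    "fc_1d": [2.5, -1.8, 3.0, -2.2], "qval_1d": [0.01, 0.03, 0.02, 0.20],
}, index=["g1", "g2", "g3", "g4"])

counts = summarise_degs(toy, ["6h", "1d"])
print(counts)
```

The resulting table has one row per time-point, so its "up" and "down" columns can be used directly as the y values of a bar chart.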

Finally, we can plot the results in a stacked bar format using the Plotly graphing library, as Plotly allows users to hover over data points to query data point attributes. The commands are as follows:

days = ['6hrs', 'day 1', 'day 3', 'day 7']
# One bar series per direction; Plotly stacks them when barmode='stack'
fig = go.Figure(data=[
    go.Bar(name='downregulated', x=days, y=[DEGs_down_6h, DEGs_down_1d, DEGs_down_3d, DEGs_down_7d]),
    go.Bar(name='upregulated', x=days, y=[DEGs_up_6h, DEGs_up_1d, DEGs_up_3d, DEGs_up_7d])
])
fig.update_layout(
    barmode='stack',
    title='DEGs in seronegative subjects',
    yaxis_title='Number of DEGs',
    font=dict(family='Arial', size=18))
fig.show()

I have done a few more customisations in the graph, including adding a title, y-axis titles and changing the font to make the graph look more professional. The output is as shown:

At one glance, you can quickly appreciate that day 1 has the most DEGs as compared to the other time points, and there are more upregulated compared to downregulated DEGs. You can also mouse over the data points to obtain the precise number of up and downregulated DEGs at every time point.

Posted in Data visualisation, python

Plotting volcano plots with Plotly

As mentioned previously, I have highlighted why volcano plots are so important in omics research. A full article summarising the main points and rationale can be found here. In this entry, we will explore how we can use Plotly to build volcano plots!

We will analyse a transcriptomics dataset published by Zak et al., PNAS, 2012. In this study, seronegative and seropositive subjects were given the MRKAd5/HIV vaccine, and the transcriptomic responses in peripheral blood mononuclear cells (PBMC) were measured at 6 hours, 1 day, 3 days and 7 days post-vaccination. On my end, I have processed and compiled the fold-change, ratio, p-values and adjusted p-values (q-values) of the seronegative subjects with respect to time = 0 using the Partek Genomics Suite, and the processed data is available in GitHub.

We first load the required packages (pandas, numpy and plotly) to plot our volcano plot using Python. Note that you will need to download the packages before you can start using them. I have chosen to use the Plotly graphing library for data visualisation, as Plotly allows users to hover over data points to query data point attributes. The commands are as follows:

import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

Next, we will load and inspect the processed dataframe from GitHub. It is also important to label the gene column as the index column so that we can reference the specific points to gene names. Commands are as follows:

df = pd.read_csv('',index_col=0)

The output file shows the values of p-value (pval), adjusted p-values (qval), ratio, and fold change (fc) for 6 hours, 1 day, 3 day and 7 day time-points compared to baseline (timepoint = 0):

The next step will be to create new columns for log2FC and -log10(adjusted p-value). The commands are as follows:

df['log2FC_6h'] = np.log2(df['ratio_6h'])
df['log2FC_1d'] = np.log2(df['ratio_1d'])
df['log2FC_3d'] = np.log2(df['ratio_3d'])
df['log2FC_7d'] = np.log2(df['ratio_7d'])
df['negative_log_pval_6h'] = np.log10(df['qval_6h']) * (-1)
df['negative_log_pval_1d'] = np.log10(df['qval_1d']) * (-1)
df['negative_log_pval_3d'] = np.log10(df['qval_3d']) * (-1)
df['negative_log_pval_7d'] = np.log10(df['qval_7d']) * (-1)
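One practical caveat with the -log10 transformation is that a q-value reported as exactly 0 becomes infinite and cannot be placed on the y-axis. The sketch below shows one possible workaround, which is to clip q-values at the smallest positive float before taking the logarithm. The helper `neg_log10` and the toy values are hypothetical, and whether clipping is appropriate depends on how your q-values were computed.

```python
import numpy as np
import pandas as pd

def neg_log10(qvalues):
    """Return -log10(q), clipping q at the smallest positive float to avoid infinity."""
    q = np.asarray(qvalues, dtype=float)
    floor = np.finfo(float).tiny  # smallest positive normal float
    return -np.log10(np.clip(q, floor, None))

# Hypothetical q-values, including an exact zero.
toy = pd.DataFrame({"qval_1d": [0.05, 0.001, 1.0, 0.0]},
                   index=["g1", "g2", "g3", "g4"])
toy["negative_log_pval_1d"] = neg_log10(toy["qval_1d"])
print(toy)
```

With this approach the gene with q = 0 lands at a very large but finite value instead of dropping off the plot.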

Now we are ready to plot the volcano plots for the different time-points. The advantage of using Python is that we can overlay the volcano plots of the different time-points all in one graph. The commands are as follows:

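fig = go.Figure()
# One scatter trace per time-point; gene names (the index) are shown on hover
trace1 = go.Scatter(x=df['log2FC_6h'], y=df['negative_log_pval_6h'],
    mode='markers', name='6hrs', hovertext=list(df.index))
trace2 = go.Scatter(x=df['log2FC_1d'], y=df['negative_log_pval_1d'],
    mode='markers', name='day 1', hovertext=list(df.index))
trace3 = go.Scatter(x=df['log2FC_3d'], y=df['negative_log_pval_3d'],
    mode='markers', name='day 3', hovertext=list(df.index))
trace4 = go.Scatter(x=df['log2FC_7d'], y=df['negative_log_pval_7d'],
    mode='markers', name='day 7', hovertext=list(df.index))
fig.add_traces([trace1, trace2, trace3, trace4])
fig.update_layout(title='Volcano plot for seronegatives')
fig.show()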

I will give a brief description of the above code. We first tell Plotly that we are going to plot a figure using the go.Figure() command. Next, we overlay the different scatterplots using the different traces (one for each time-point). In each trace, we specify (i) the columns for the x and y-axis, (ii) indicate that we only want to plot points using the mode= ‘markers’, (iii) indicate the figure legends for the different traces and (iv) indicate the text labels when we hover over the data points. Under the fig.update_layout, we also added the title of the plot. Output for the graph is thus as follows:

At one glance, you can quickly appreciate that day 1 has the most differentially expressed genes as compared to the other time-points. You can also mouse over the data points to examine which of these genes are most differentially expressed. An example I have provided is CXCL10, which is one of the genes that is induced to a great extent after one day post-vaccination.

Now that we have established that day 1 is the most interesting time-point, we may want to zoom into the genes that are at the edges of the volcano plot. We can type in the below command to specifically plot the day 1 volcano plot. Note that I have added the text in the volcano plot this time (using ‘textposition’ command) so that we can quickly visualise the genes that are most differentially expressed:

fig_d1 = px.scatter(df, x='log2FC_1d', y='negative_log_pval_1d', text=df.index)
fig_d1.update_traces(textposition='top center')
fig_d1.update_layout(title_text='Volcano plot for seronegatives (day 1)')
fig_d1.show()

Output file is as follows:

Now, we can quickly visualise that the top upregulated genes at day 1 include the interferon-related genes, such as IDO1, CXCL10, RSAD2, CCL2, CCL8 and LAMP3. As you can begin to appreciate, a volcano plot gives users a quick sense of the data and highlights the genes and proteins that are most differentially expressed.
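Labelling every point can get crowded when there are thousands of genes. One option is to label only the most upregulated significant genes and leave the rest blank. The sketch below shows the idea on made-up data; the helper `top_gene_labels`, the placeholder gene names and the values are all hypothetical.

```python
import pandas as pd

def top_gene_labels(df, fc_col, q_col, n=3, q_cutoff=0.05):
    """Return a label Series: gene name for the n most upregulated significant genes, '' otherwise."""
    significant = df[df[q_col] < q_cutoff]
    top = significant[fc_col].nlargest(n).index
    return pd.Series([g if g in top else "" for g in df.index], index=df.index)

# Hypothetical values; gene names are placeholders.
toy = pd.DataFrame(
    {"log2FC_1d": [5.0, 4.2, 0.1, 3.9, 6.0],
     "qval_1d": [0.001, 0.002, 0.8, 0.01, 0.5]},
    index=["gene_a", "gene_b", "gene_c", "gene_d", "gene_e"],
)
labels = top_gene_labels(toy, "log2FC_1d", "qval_1d", n=2)
print(labels.tolist())
```

The resulting Series could then be passed to the text argument of the scatter plot in place of df.index, so that only the selected genes are annotated. Note that gene_e has the largest fold-change in the toy data but is not labelled because its q-value does not pass the cutoff.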

Posted in Demographics signature

The transcriptional landscape of age in human peripheral blood

Figure 1
Molecular pathways that were most differentially regulated by age. Source is taken from Marjolein J Peters et al., 2016.

Chronological age is a major risk factor for many common diseases including heart disease, cancer and stroke, three of the leading causes of death.

Previously, APOE, FOXO3 and 5q33.3 were the only identified loci consistently associated with longevity.

The discovery stage included six European-ancestry studies (n=7,074 samples) with whole-blood gene expression levels (11,908 genes). The replication stage included 7,909 additional whole-blood samples. A total of 1,497 genes were found to be associated with age, of which 897 are negatively correlated and 600 are positively correlated.

Among the negatively age-correlated genes, three major clusters were identified. The largest group, Cluster #1, consisted of three sub-clusters enriched for (1a) RNA metabolism functions, ribosome biogenesis and purine metabolism; (1b) multiple mitochondrial and metabolic pathways, including 10 mitochondrial ribosomal protein (MRP) genes; and (1c) DNA replication, elongation and repair, and mismatch repair. Cluster #2 contained factors related to immunity, including T- and B-cell signalling genes, and genes involved in hematopoiesis. Cluster #3 included cytosolic ribosomal subunits.

The positively age-correlated genes revealed four major clusters. Cluster #1: Innate and adaptive immunity. Cluster #2: Actin cytoskeleton, focal adhesion, and tight junctions. Cluster #3: Fatty acid metabolism and peroxisome activity. Cluster #4: Lysosome metabolism and glycosaminoglycan degradation.

DNA methylation, measured by CpG methylation, was not associated with chronological age but was associated with the gene expression levels. This result hints at the possibility that DNA methylation could be affecting regulation of gene expression.

Transcriptomic age and epigenetic age (both Hannum and Horvath) were positively correlated, with r-squared values varying between 0.10 and 0.33.

Posted in Resource, VSV vectors

Systems Vaccinology Identifies an Early Innate Immune Signature as a Correlate of Antibody Responses to the Ebola Vaccine rVSV-ZEBOV

Immunologic parameters that are correlated with antibody responses to rVSV-EBOV. Source from Rechtien et al., Cell Reports, 2017.

Predicting and achieving vaccine efficacy remains a major challenge. Here, Rechtien et al. used a systems vaccinology approach to disentangle the early innate immune responses elicited by the Ebola vaccine rVSV-Zaire Ebola virus (ZEBOV) to identify innate immune responses correlating with Ebola virus (EBOV)-glycoprotein (GP)-specific antibody induction. Of note, this replication-competent recombinant vaccine candidate is based on the vesicular stomatitis virus (rVSV)-based vaccine vector, which has been shown to be safe and immunogenic in a number of phase I trials.

The vaccine rVSV-ZEBOV induced a rapid and robust increase in cytokine levels, peaking at day 1, especially for CXCL10, MCP-1 and MIP-1β. Assessment of PBMCs revealed significant induction of co-stimulatory molecules, monocyte/DC activation and NK cell activation at day 1 post-vaccination. The expression of these molecules began to decline at day 3.

Interestingly, CXCL10 plasma levels and frequency of activated NK cells at day 3 were found to be positively correlated with antibody responses. CD86+ expression in monocytes and mDCs at day 3 was negatively correlated with antibody responses (See figure on top).

The largest number of upregulated genes was detected at day 1 post-vaccination. Critically, the early gene signature linked to the CXCL10 pathway, including TIFA (TRAF-interacting protein with forkhead-associated domain) on day 1, SLC6A9 (solute carrier family 6 member 9) on day 3, NFKB1 and NFKB2, was most predictive of antibody responses.

Data is stored under NCBI GEO: GSE97590.

Posted in Dengue, Resource

A 20-Gene Set Predictive of Progression to Severe Dengue

Methodology employed by Robinson et al., Cell Reports, 2019. The 20-gene set was used to distinguish between individuals with severe and mild dengue

The gene signatures predictive of severe dengue disease progression are poorly understood.

The study by Robinson et al. utilised 10 publicly available datasets and divided them into 7 “discovery” and 3 “validation” datasets. In the discovery datasets, a total of 59 differentially expressed genes (FDR < 10%, effect size > 1.3-fold) were detected between patients who progressed to DHF and/or DSS (DHF/DSS) and patients with an uncomplicated course (dengue fever).

Applying an iterative greedy forward search to the 59 genes revealed a final set of 20 differentially expressed genes (3 over-expressed, 17 under-expressed) in DHF/DSS (Gene list as shown in figure above). A dengue score for each sample was obtained by subtracting the geometric mean expression of the 17 under-expressed genes from the geometric mean expression of the 3 over-expressed genes.
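To make the scoring rule concrete, here is a minimal sketch of a score computed as the geometric mean of the over-expressed genes minus the geometric mean of the under-expressed genes. The function names, the placeholder gene names and the expression values are all hypothetical, and the sketch does not reproduce the authors' preprocessing or their actual gene list.

```python
import numpy as np
import pandas as pd

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    values = np.asarray(values, dtype=float)
    return float(np.exp(np.log(values).mean()))

def signature_score(sample, over_genes, under_genes):
    """Geometric mean of over-expressed genes minus geometric mean of under-expressed genes."""
    return geometric_mean(sample[over_genes]) - geometric_mean(sample[under_genes])

# Hypothetical expression values for one sample; gene names are placeholders.
sample = pd.Series({"over_1": 8.0, "over_2": 2.0, "under_1": 1.0, "under_2": 4.0})
score = signature_score(sample, ["over_1", "over_2"], ["under_1", "under_2"])
print(round(score, 3))
```

In this toy case the geometric means are 4 and 2, giving a score of 2. A higher score means the over-expressed genes dominate, which in the study's framing points towards DHF/DSS.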

The 20-gene dengue severity scores distinguished DHF/DSS from dengue fever upon presentation and prior to the onset of severe complications with a summary area under the curve (AUC) = 0.79 in the discovery datasets. The 20-gene dengue scores also accurately identified dengue patients who will develop DHF/DSS in all three validation datasets.
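For readers less familiar with AUC, it can be read as the probability that a randomly chosen severe case has a higher score than a randomly chosen mild case. The sketch below computes this for a single toy dataset. It illustrates the basic idea only and is not the summary AUC procedure used in the paper; the function `rank_auc` and the scores are made up.

```python
def rank_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative case (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical scores for severe (positive) and mild (negative) cases.
severe = [2.1, 1.4, 0.9]
mild = [1.0, 0.3, 0.9, -0.2]
auc = rank_auc(severe, mild)
print(round(auc, 3))
```

An AUC of 0.5 would mean the score is no better than chance, while 1.0 would mean every severe case scores above every mild case.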

To further validate this signature, the authors tested a cohort of prospectively enrolled dengue patients in Colombia. The 20-gene dengue score, measured by qPCR, distinguished severe dengue from dengue with or without warning signs (AUC = 0.89) and even severe dengue from dengue with warning signs (AUC = 0.85).

Finally, the 20-gene set is significantly downregulated in natural killer (NK) and NK T (NKT) cells, indicating the role of NK and NKT cells in modulating severe disease outcome.

Dataset deposited under Gene Expression Omnibus (GEO): GSE124046

Posted in Dengue, Resource

Immunotranscriptomic profiling the acute and clearance phases of a human challenge dengue virus serotype 2 infection model

Differentially expressed genes at day 8 and 28 after rDEN2Δ30 infection. Source from Hanley JP et al., Nature Communications, 2021.

rDEN2Δ30 is a recombinant serotype 2 virus based on the American genotype 1974 Tonga DENV2 virus, which has been partially attenuated by deletion of 30 nucleotides in the 3′ untranslated region of the RNA genome (Δ30). rDEN2Δ30 infection is known to induce modest viremia in all flavivirus-naive subjects and a mild, transient non-pruritic rash in 80% of recipients.

rDEN2Δ30 infection could hence be a suitable model to evaluate molecular signatures responsible for asymptomatic or mild DENV-2 infection.

In this study by Hanley JP et al., RNA-seq was performed on whole blood collected from rDEN2Δ30-infected subjects at 0, 8, and 28 days post infection. rDEN2Δ30 induced reproducible but modest viremia, with a mild rash as the only clinically significant finding in DENV-naive subjects.

Principal component analysis revealed minimal overlap between baseline (day 0) and peak viremia (day 8). The day 28 data (post viremia) partially overlapped with the baseline (day 0) and acute (day 8) timepoints. Pathways enriched in the type I and type II interferon and antiviral responses were upregulated at day 8, whereas pathways controlling translational initiation were downregulated. NF-κB, IL-17 signaling pathways, apoptosis, toll-like receptor signaling, response to viruses, ribosomes, and defense responses were also differentially regulated at day 28.

Myeloid cells including monocytes and activated dendritic cells were significantly increased during acute infection and returned to baseline. In contrast, regulatory T cells (Tregs) were significantly decreased during acute stage.

Gene ontology pathway analysis revealed that the viremia-tracking set of genes was enriched for both response to and regulation of type I and II interferon pathways, including JAK/STAT signaling. Genes encoding for proteins that directly inhibit viral genome replication and involved in protein ubiquitination and catabolism, especially ISG15 pathway, tracked with viremia. Day 28 revealed more varied pathways, including protein ubiquitination, cell migration, cytoskeletal reorganization, and angiogenesis.

Baseline transcript signatures can potentially predict whether the subjects would develop rash after rDEN2Δ30 infection. Higher baseline expression of myeloid nuclear differentiation antigen (MNDA), and cell surface associated cellular processes such as tetraspanin CD37, integral membrane 2B (ITM2B), and genes involved in autophagy (VMP1) was associated with protection from rash. These genes are mostly related to myeloid responses, membrane regulation, autophagy, K63 ubiquitination, and cell morphogenesis.

Transcriptomic signatures modulated by rDEN2Δ30 infection and severe dengue are distinct. Only one gene family, the guanine binding protein (GBP1/2) genes was differentially regulated in both severe dengue and during mild rDEN2Δ30 infection.

Data deposited in Gene Expression Omnibus under accession number GSE152255

Posted in Dengue, Resource

Increased adaptive immune responses and proper feedback regulation protect against clinical dengue

Genes related to antigen presentation were significantly increased in the asymptomatic compared to the symptomatic dengue individuals. Manuscript by E Simon-Lorière et al., Science Translational Medicine, 2017.

Dengue infections can be asymptomatic, symptomatic, or occasionally progress to severe dengue, a life-threatening condition characterised by a cytokine storm, vascular leakage, and shock. However, the molecular and immunological mechanisms underlying asymptomatic dengue virus (DENV) infection remain largely unknown.

In the publication, E Simon-Lorière et al. recruited DENV-infected children in Cambodia. Nine individuals remained strictly asymptomatic at the time of inclusion and during the 10-day follow-up period. PBMCs from 8 asymptomatic DENV-1 viremic individuals and 25 symptomatic dengue patients were used for further gene expression analysis.

Asymptomatic individuals had an increase in the percentage of CD4+ T cells and a decrease in CD8+ T cells compared to symptomatic dengue individuals. However, CD14+ monocytes, Lin-CD11c+ dendritic cells, CD19+ B cells, and CD335+ natural killer cells were not significantly different between asymptomatic and symptomatic individuals.

Transcriptomic signatures were distinct between asymptomatic and symptomatic individuals. The top pathways that diverge the most between asymptomatic and clinical dengue individuals were related to immune processes. Notably, the transcriptomic differences cannot be explained by differences in viral load or immune status.

The innate immune responses were not significantly different between the asymptomatic and symptomatic individuals. Instead, the most significantly activated pathway in asymptomatic individuals was related to “nuclear factor of activated T-cells (NFAT) mediated regulation of immune response.” These genes include CIITA, CD74 and various human leukocyte antigen (HLA) genes, where their expression differences were also validated at the protein levels (See figure on top).

Protein kinase Cθ (PKCθ) signaling in T lymphocytes was also highly activated in asymptomatic viremic individuals. Genes upregulated included AKT3, SOS1, PAK1, and SLAMF6, as well as T-cell costimulatory pathways such as ICOS-ICOSL, and CD28 and CTLA4 signaling in cytotoxic T-cells.

In contrast, genes related to B-cell activation, differentiation and plasma cell development (BLIMP-1, IRF4) were downregulated in asymptomatic individuals. This finding is correlated with the reduction in antibody production in the asymptomatic individuals.

Data is saved in Gene Expression Omnibus under accession number GSE100299