Posted in Data visualisation, python

Making interactive and executable notebooks/dashboards with Lux and Voilà packages

As described in my previous blog posts, pair plots and correlation matrices are useful to quickly analyse relationships between multiple variables. However, when analysing the datasets with more than 10-20 variables, these analysis methods have their limitations. One critical limitation is the difficulty in assessing the effects of 2 or more categorical variables. Another is the difficulty in visualising a large number of variables plotted altogether in one plot.

Here, I propose two python packages that will make graph plotting less tedious. First is the Voilà package, that supports Jupyter interactive widgets and second is the Lux package, which is integrated with a Jupyter interactive widget that allows users to quickly browse through large collections of data directly within their Jupyter notebooks. These packages are useful in displaying data interactively, which makes them highly versatile for fast experimentation with data. I will share the features of these 2 packages as I run through the procedures for running these packages:

First, we need to install the packages. To install Voilà, execute the following command:

pip install voila

Note that this command will also be automatically be installed as an extension if you have version 3 of JupyterLab installed:

Next, to install Lux, enter the following command in terminal:

pip install lux-api

With these packages, you are now ready to use these applications. We will import all the necessary packages and use the iris dataset from PlotlyExpress as a proof of concept.:

import csv
import lux
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
df = px.data.iris()

The output file is as follows:

sepal_lengthsepal_widthpetal_lengthpetal_widthspeciesspecies_id
05.13.51.40.2setosa1
14.93.01.40.2setosa1
24.73.21.30.2setosa1
34.63.11.50.2setosa1
45.03.61.40.2setosa1

Next, launch Voilà in the Jupyter Notebook extension (indicated in arrow):

This should allow you to toggle between Pandas and Lux freely (indicated in arrow):

The Lux package automatically performs a pairwise plot against all variables in your dataframe, allowing you to focus on the scatterplots that show correlated relationship. Under the “Distribution” tab, you can quickly visualise the distribution of the data:

For the correlation graph, you can click onto the plots that you are interested in, and click the magnifying glass which indicates “intent.” This will then zoom into the graph further, allowing you to see how the other variables (in this case, species), interact with your scatterplot. This immediately provides you with a 3-dimensional view of the dataset.

As you can begin to appreciate, the Voilà package is a powerful tool to plot interactive widgets directly within your local address. You can even publish it online, allowing other users to use the interactive widgets to explore the dataset according to their research intent. Combined with the Lux package, this allows users, even those without bioinformatics background, to zoom into the interesting scatterplots for further analysis.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s