Interactive circle packing plots

Introduction

Here is a fun little tidbit that I came across over the weekend. :)

I was looking for something suited for visualizing hierarchical categorical data that goes beyond the regular bar graphs. This D3 zoomable circle packing visualization, done using the circlepackeR package, uses a series of nested circles that you can click on and zoom in/out of.

To learn more, please see the official documentation by the package author.

Import and pre-process data

As usual, we will use the IBM Telco customer churn dataset, which I cleaned up in a previous post.

Since I'm quite a bit more comfortable with data wrangling in Python, I will first get the number of customers in each level of every categorical variable using pandas:

## Import data
import pandas as pd

df = pd.read_csv("https://github.com/nchelaru/data-prep/raw/master/telco_cleaned_renamed.csv")

## Get categorical column names
cat_list = [] 

for col in df.columns:
  if df[col].dtype == object:
    cat_list.append(col)
    
## Get all possible levels of every categorical variable and number of data points in each level
cat_levels = {}

for col in cat_list:
  levels = df[col].value_counts().to_dict()
  cat_levels[col] = levels
  
## Convert nested dictionary to dataframe
nestdict = pd.DataFrame(cat_levels).stack().reset_index()

nestdict.columns = ['Level', 'Category', 'Population'] 

## Output data to file
nestdict.to_csv("./results/nested_dict.csv")

## Preview dataframe
nestdict.head()
                       Level         Category  Population
0  Bank transfer (automatic)    PaymentMethod      1542.0
1                      Churn            Churn      1869.0
2    Credit card (automatic)    PaymentMethod      1521.0
3                        DSL  InternetService      2416.0
4                 Dependents       Dependents      2099.0
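If the stack step feels a bit magical, here is a minimal sketch of the same reshaping on a made-up toy table. The `toy` data, its column names and the variable names are all my own placeholders, not part of the Telco dataset. I also swapped the dtype loop for `select_dtypes`, and added an explicit `dropna()` because newer pandas versions may keep the empty cells when stacking.

```python
import pandas as pd

# Hypothetical toy data standing in for the Telco dataset
toy = pd.DataFrame({
    "Contract": ["Monthly", "Yearly", "Monthly", "Monthly"],
    "Churn": ["Yes", "No", "No", "Yes"],
    "Tenure": [1, 24, 5, 2],  # numeric column, skipped below
})

# Keep only the non-numeric (categorical) columns
cat_cols = toy.select_dtypes(exclude="number").columns

# One {level: count} dictionary per categorical column
counts = {col: toy[col].value_counts().to_dict() for col in cat_cols}

# Wide table: one row per level, one column per variable.
# Cells where a level does not belong to a variable are NaN,
# which is why the counts end up as floats.
wide_counts = pd.DataFrame(counts)

# Long table: one row per (level, variable) pair that actually exists
long_counts = wide_counts.stack().dropna().reset_index()
long_counts.columns = ["Level", "Category", "Population"]

print(long_counts)
```

The wide table is mostly empty, since each level only belongs to one variable. Stacking it and dropping the empty cells leaves exactly one row per level, which is the shape the plotting step needs.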

Create circle packing visualization

Now we will take the prepared data and move to R for making the plot:

## Import libraries
library(tidyverse)
library(circlepackeR)  
library(hrbrthemes)
library(htmlwidgets)
library(data.tree)

## Import data
nestdict <- read.csv("nested_dict.csv")  ## CSV written by the Python step (saved there as ./results/nested_dict.csv)

## Prepare data format
nestdict$pathString <- paste("world", nestdict$Category, nestdict$Level, sep = "/")
population <- as.Node(nestdict)

## Make the plot
x <- circlepackeR(population, size = "Population", color_min = "hsl(56,80%,80%)", color_max = "hsl(341,30%,40%)")

## Save widget to HTML file for display
saveWidget(x, 'widget.html')
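Since I think more easily in Python, here is a rough sketch of what the path strings encode. It is not how data.tree works internally, just my own illustration of the root/variable/level nesting. The helper names `path_string` and `build_tree` are made up, and the three rows are taken from the preview above.

```python
# Rows in the same Level / Category / Population layout as the preview
rows = [
    ("Bank transfer (automatic)", "PaymentMethod", 1542),
    ("Credit card (automatic)", "PaymentMethod", 1521),
    ("DSL", "InternetService", 2416),
]


def path_string(category, level, root="world"):
    """Mirror paste("world", Category, Level, sep = "/") from the R code."""
    return "/".join([root, category, level])


def build_tree(rows):
    """Nest the counts as root -> variable -> level, following each path."""
    tree = {}
    for level, category, population in rows:
        parts = path_string(category, level).split("/")
        node = tree
        # Walk (or create) every intermediate node along the path
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        # The last path element is the leaf that carries the count
        node[parts[-1]] = population
    return tree


tree = build_tree(rows)
print(tree)
```

One caveat of this sketch: a level name that itself contains a "/" would be split into extra layers, so such names would need cleaning first.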

Finally, move the HTML file to the results folder so we can visualize it:

mv widget.html ./results

At a glance, the sizes of the innermost circles (one per level of each categorical variable) give a quick overview of how customers are distributed across the levels of that variable. Click on the circles to zoom in and out!
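If you would like the numbers behind that visual impression, the share of each level within its variable is easy to compute. This is a small sketch with invented counts (not the real Telco numbers), and the `Share` column is my own addition.

```python
import pandas as pd

# Hypothetical counts in the same layout as nestdict
level_counts = pd.DataFrame({
    "Level": ["Yes", "No", "DSL", "Fiber optic", "No internet"],
    "Category": ["Churn", "Churn", "InternetService",
                 "InternetService", "InternetService"],
    "Population": [25, 75, 40, 50, 10],
})

# Total number of customers per variable, repeated on every row of that variable
totals = level_counts.groupby("Category")["Population"].transform("sum")

# Fraction of customers that fall into each level of its variable
level_counts["Share"] = level_counts["Population"] / totals

print(level_counts)
```

As far as I understand the usual D3 pack layout, it is the area of a circle rather than its radius that tracks the count, so these shares are worth a glance when two circles look similar in size.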

When the occasion is right, this could be a really fun way to add some pizzazz to your visualizations. :)

If you want to try this yourself, click on "Remix" in the upper right corner to get a copy of the notebook in your own workspace. Please remember to import both the Python (circle_packing_Python) and R (circle_packing_R) runtimes from this notebook (intelrefinery/interactive-circle-packing-plots) under "Runtime Settings" to ensure that you have all the installed packages and can start right away.