Plotting in Nextjournal

Data visualization is at the core of Nextjournal. With no installation required, Nextjournal allows you to create beautiful, interactive graphics across multiple languages within the same article.

library(plotly)

# plot_ly(z = ~volcano, type = "surface")

d <- diamonds[sample(nrow(diamonds), 1000), ]
plot_ly(
 d, x = ~carat, y = ~price,
 # Hover text:
 text = ~paste("Price: ", price, '$<br>Cut:', cut),
 color = ~carat, size = ~carat, type='scatter', mode='markers')

Overview

Basic Plotting

A graph can be created and displayed with a single function. In this case, the Nextjournal default R environment is being used.

hist(rnorm(200, 1), breaks=100, main = "Random Deviates of a Normal Distribution", xlab = "Events")

The output of the previous cell can be referenced in a Bash cell. The file command provides more insight into the nature of the visualization:

file -b __out-0

The output, called __out-0, is an SVG file. Nextjournal can display other data types:

png('/results/my-plot.png')

file -b my-plot.png

Works as expected: __out-0 is an SVG file and my-plot.png is a PNG file.
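The same kind of check can be sketched in Python by looking at a file's leading bytes, which is roughly how file tells formats apart. This is a minimal illustration, not how Nextjournal itself detects output types. The helper name sniff_image_type and the two throwaway files are hypothetical.

```python
from pathlib import Path
import tempfile

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def sniff_image_type(path):
    """Return 'png', 'svg', or 'unknown' based on the file's leading bytes."""
    head = Path(path).read_bytes()[:1024]
    if head.startswith(PNG_SIGNATURE):
        return "png"
    # SVG is XML text, so look for an <svg tag near the start of the file.
    if b"<svg" in head.lower():
        return "svg"
    return "unknown"

# Demo on two throwaway files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    svg_path = Path(tmp) / "example.svg"
    svg_path.write_text(
        '<?xml version="1.0"?><svg xmlns="http://www.w3.org/2000/svg"></svg>'
    )
    png_path = Path(tmp) / "example.png"
    # Signature plus padding only: enough to sniff, not a valid image.
    png_path.write_bytes(PNG_SIGNATURE + b"\x00" * 16)
    detected = {p.name: sniff_image_type(p) for p in (svg_path, png_path)}

print(detected)
```

A signature check like this only identifies the container format; it does not validate that the file is a well-formed image.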

Take note of png('/results/my-plot.png') in the code cell above. Nextjournal will attempt to work with any file added to the /results directory. This may be uploaded data or images or, as in this case, the result of a calculation. For more information on /results, refer to Understanding Results.
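As a rough sketch of the same idea from a Python cell, the snippet below writes a small SVG file into a results directory. It assumes the /results convention described above applies to Python cells as well. The helper write_bar_chart_svg and the sample values are hypothetical, and a temporary directory stands in for /results so the sketch runs anywhere.

```python
from pathlib import Path
import tempfile

def write_bar_chart_svg(values, out_path, bar_width=20, height=100):
    """Write a minimal SVG bar chart for a non-empty list of
    non-negative `values` to `out_path` and return the path."""
    peak = max(values) or 1  # avoid dividing by zero when all values are 0
    bars = []
    for i, v in enumerate(values):
        bar_h = height * v / peak
        bars.append(
            f'<rect x="{i * bar_width}" y="{height - bar_h:.1f}" '
            f'width="{bar_width - 2}" height="{bar_h:.1f}" />'
        )
    svg = (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{len(values) * bar_width}" height="{height}">'
        + "".join(bars)
        + "</svg>"
    )
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(svg)
    return out_path

# On Nextjournal the target directory would be /results (assumption);
# a temporary directory is used here instead.
results_dir = Path(tempfile.mkdtemp())
saved = write_bar_chart_svg([3, 7, 5, 1], results_dir / "my-plot.svg")
print(saved.name, saved.stat().st_size > 0)
```

In practice a plotting library would produce the file; the point is only that anything written into the results directory becomes an output of the cell.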

Working With Data

There are several ways to work with data on Nextjournal. You can

  • Use the ➕ insert menu or the ··· action menu and select File to upload data
  • Use an external repository like GitHub or S3
  • Use command line tools like wget
  • Use output generated from code cells

This example will use the first option. Both the ➕ insert menu and the ··· action menu are exposed when selecting or hovering over article elements like paragraphs or code cells.

artist_data.csv
artwork_data.csv

The data above reflects the Tate's permanent collection. Every artist is represented in artist_data.csv; every individual work of art is enumerated in artwork_data.csv.
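Before plotting, it can help to check the columns the later examples rely on. The sketch below uses only the Python standard library and a few placeholder rows, not real Tate records. The column names gender and yearOfBirth are taken from the examples in this article; the name column and the summarize helper are assumptions for illustration.

```python
import csv
import io
from collections import Counter

# Placeholder rows, not real Tate records. Only `gender` and `yearOfBirth`
# are column names taken from this article; `name` is an assumption.
SAMPLE_CSV = """name,gender,yearOfBirth
Artist A,Female,1930
Artist B,Male,1852
Artist C,Male,
Artist D,Female,1962
"""

def summarize(csv_file):
    """Count rows per gender and collect the birth years that are present."""
    gender_counts = Counter()
    birth_years = []
    for row in csv.DictReader(csv_file):
        gender_counts[row["gender"]] += 1
        if row["yearOfBirth"]:  # skip missing values
            birth_years.append(int(float(row["yearOfBirth"])))
    return gender_counts, birth_years

gender_counts, birth_years = summarize(io.StringIO(SAMPLE_CSV))
print(dict(gender_counts), min(birth_years), max(birth_years))
```

To run this against the uploaded file instead, pass an open file handle for artist_data.csv to summarize in place of the in-memory sample.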

Nextjournal Plotting Defaults

Default Nextjournal environments offer interactive plotly.js graphs for R (plot_ly and ggplotly), Python (plotly.py), and Julia (plotly.jl).

Plotly is language agnostic. Rather than outputting static images, the Plotly libraries for R, Python, and Julia serialize each figure to JSON and hand it to plotly.js for rendering.

  • R also supports static plots with ggplot2 and the default R graphics package
  • Python also supports static plots using the matplotlib library
  • Julia also supports static plots with gr using the Plots visualization interface and toolset.
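To illustrate the language-agnostic point about Plotly, here is a minimal sketch of the kind of JSON figure description that plotly.js renders: a data list of traces plus a layout object. It is written as a plain Python dictionary rather than with the Plotly library, and the year values are placeholders rather than real data.

```python
import json

# Two overlaid histogram traces, described the way plotly.js expects:
# a "data" list of traces plus a "layout" object.
# The x values are placeholders, not real Tate data.
figure = {
    "data": [
        {"type": "histogram", "name": "Female",
         "x": [1930, 1945, 1962], "opacity": 0.6},
        {"type": "histogram", "name": "Male",
         "x": [1852, 1901, 1930, 1958], "opacity": 0.6},
    ],
    "layout": {
        "barmode": "overlay",
        "xaxis": {"title": {"text": "Year of Birth"}},
    },
}

# Serialization is what makes the figure portable between languages.
payload = json.dumps(figure)
restored = json.loads(payload)
print(sorted(restored))
```

Because the figure is just data, the R, Python, and Julia libraries can all produce an equivalent description and leave the drawing to the same renderer.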

R

The Nextjournal R Environment provides R. For plotting, Nextjournal supports the default R graphics package (graphics), plot_ly(), ggplot2 (ggplot()), and ggplotly() with no additional installation required.

The Default R Graphics Package

This first example uses the standard smoothScatter() function to plot the birth year of artists represented in the Tate Museum's permanent collection. Note that smoothScatter() does not require the loading of any dependencies.

artists <- read.csv("artist_data.csv", header=TRUE)
born <- artists$yearOfBirth
birth_distribution = smoothScatter(born, 1:length(born), axes=FALSE, xlab="Year", ylab="",
                                   main="Distribution of Artist's Birth Years at the Tate")
axis(1, col.ticks="blue")
birth_distribution

Working With Dependencies

Plotly and ggplot2 are external dependencies that offer more features than the default R graphics.

Load the tidyverse collection of R packages, which includes two dependencies used in the upcoming sections, ggplot2 (ggplot()) and readr (read_csv()). The Plotly package provides two important plotting functions, plot_ly() and ggplotly().

library(tidyverse)
library(plotly)

ggplot2

ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics.

artists <- read_csv("artist_data.csv")
born <- artists$yearOfBirth
df <- data.frame(born)
ggplot(df, mapping=aes(x = born, y = as.numeric(row.names(df)))) +
  geom_point(size=2.2, alpha=0.4, shape=15) +
  labs(x = "Year", y=element_blank(),
       title = "Distribution of Artist's Birth Years at the Tate",
       subtitle = "From the Museum's Permanent Collection") +
  theme_bw() +
  theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(),
        panel.grid.minor=element_blank(), panel.grid.major.y=element_blank())

Plotly

The plot_ly() function transforms data into a Plotly object to enable interactive graphics and advanced plotting features.

This histogram compares the male and female artists in the Tate's collection by year of birth. The interactive features of Plotly in R are useful here because the two datasets overlap. Turning off the male histogram gives a better sense of the growth of female acquisition; turning on the male histogram shows how far institutions have yet to go.

artists <- read_csv("artist_data.csv")
female_artists <- artists[artists$gender == "Female",]
male_artists <- artists[artists$gender == "Male",]
plot_ly(alpha=0.6) %>%
  add_histogram(data=female_artists, x=~yearOfBirth, name="Females") %>%
  add_histogram(data=male_artists, x=~yearOfBirth, name="Males") %>%
  layout(barmode="overlay", xaxis=list(title="Year of Birth"))

ggplotly

The ggplotly() function transforms a static ggplot object into a Plotly object. More detailed information is available at the Plotly ggplot2 Library documentation.

artists <- read_csv("artist_data.csv")
born <- artists$yearOfBirth
df <- data.frame(born)
id <- as.numeric(row.names(df))
ggplotly(
  ggplot(df, mapping=aes(x = born, y = id)) +
    geom_point(size=1.5, alpha=0.4, shape=15) +
    labs(x = "Year", y="", title = "Distribution of Artist's Birth Years at the Tate") +
    theme_bw() +
    theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(),
          panel.grid.minor=element_blank(), panel.grid.major.y=element_blank()))

Multiple Plots

A Nextjournal cell can show multiple graphs. The runner detects each new figure automatically and displays the figures in order.

artworks <- read_csv("artwork_data.csv")
drop <- c("accession_number", "artistRole", "artistId", "dateText", "creditLine", "units",
          "inscription", "thumbnailCopyright", "thumbnailUrl", "url")
artworks_rem <- artworks[ , !(names(artworks) %in% drop)]
# Keep only rows where height, width, and year are all present
artworks_size <- artworks_rem[complete.cases(artworks_rem[, c("height", "width", "year")]), ]
artworks_size$size <- artworks_size$height * artworks_size$width
metal <- artworks_size[artworks_size$medium == "Steel" | artworks_size$medium=="Bronze",]
# First figure: acquisitions by year
plot_ly(data=metal, x=~acquisitionYear, name="Sculptural Acquisitions")
# Second figure: 3D scatter of year created, year acquired, and size
plot_ly(data=metal, x=~year, y=~acquisitionYear, z=~size, color=~medium,
        colors = c('#BF382A', '#0C4B8E'), text=~artist, marker=list(size=4, opacity=0.5)) %>%
  add_markers() %>%
  layout(scene = list(xaxis = list(title = 'Year Created'),
                      yaxis = list(title = 'Year of Acquisition'),
                      zaxis = list(title = 'Size')),
         annotations = list(x = 1.13, y = 1.05, text = 'Material',
                            xref = 'paper', yref = 'paper', showarrow = FALSE))

Python

The Nextjournal Python Environment provides both Python 3 and Python 2 support.

Matplotlib

Matplotlib is a plotting library for Python with a MATLAB-like interface.

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

artwork_data = pd.read_csv("artwork_data.csv")
# Drop the columns this plot does not use (drop() returns a new dataframe, so assign it).
artwork_data = artwork_data.drop(columns=["accession_number", "artistRole", "artistId", "dateText", "acquisitionYear", "dimensions", "width", "height", "depth", "creditLine", "units", "inscription", "thumbnailCopyright", "thumbnailUrl", "url"])
# Drop the rows listed as NaN, otherwise indexing oil, acrylic, and watercolour artworks
# yields the error "ValueError: cannot index with vector containing NA / NaN values."
# Replace this line with something more sensible to get a more complete dataset.
artwork_data.dropna(subset=['medium'], inplace=True)
artwork_data["year"] = pd.to_numeric(artwork_data["year"], errors="coerce")
oil = artwork_data[artwork_data["medium"].str.contains("oil", case=False)]
acrylic = artwork_data[artwork_data["medium"].str.contains("acrylic", case=False)]
watercolour = artwork_data[artwork_data["medium"].str.contains("watercolour", case=False)]
fig, ax = plt.subplots()
ax.set(xlabel='year', ylabel='number of works', title='Paintings at the Tate, by Medium')
ax.hist([oil["year"], acrylic["year"], watercolour["year"]], stacked=True)
fig

Plotly

Plotly's Python graphing library creates interactive, publication-quality graphs online.

First, display the information as a table using Plotly's Figure Factory module.

import pandas as pd

# plotly imports
import plotly.figure_factory as ff
# plotly.graph_objs contains all the helper classes to make/style plots
import plotly.graph_objs as go

artist_data = pd.read_csv("artist_data.csv")
# Display the first 12 rows and 3 columns of the dataframe
ff.create_table(artist_data.iloc[:12,:3], index=False)

Plot two histograms that compare the number of male and female artists in the Tate collection, distributed by year of birth.

import numpy as np

artist_data = pd.read_csv("artist_data.csv")
male = artist_data['gender'] == 'Male'
female = artist_data['gender'] == 'Female'
trace1 = go.Histogram(
    x=np.array((artist_data[female]['yearOfBirth'])),
    name='Female')
trace2 = go.Histogram(
    x=np.array((artist_data[male]['yearOfBirth'])),
    name='Male')
trace_data = [trace1, trace2]
layout = go.Layout(bargroupgap=0.3)
go.Figure(data=trace_data, layout=layout)

Hover over a data point to view its values, both here and in the published view. Traces can also be toggled on and off by clicking their entries in the legend.

For more examples and details about this library, please refer to the official Plotly Python Open Source Graphing Library documentation.

Julia

The Nextjournal Julia Environment provides Julia support.

Plots offers the most flexible way to visualize data using Julia in Nextjournal. This preinstalled library provides a unified interface to different plotting libraries, including plotly and gr. Plotly graphs are interactive, while gr is faster for large data sets.

While documentation exists for both the Plotly Julia Library and Julia Package GR, these examples use Plots, so the Plots documentation offers the most useful supplementary information.

Plotly

using Plots; plotly()
scatter(rand(10), rand(10), title="Plot.ly Backend")

The Plotly Julia Library offers more documentation examples for reference.

gr

using Plots; gr()

gr produces a png file which is displayed by Nextjournal.

scatter(rand(10), rand(10), title="GR Backend")