Plotting in Nextjournal

Data visualization is at the core of Nextjournal. With no installation required, Nextjournal allows you to create beautiful, interactive graphics across multiple languages within the same article.

Overview

Basic Plotting

A graph can be created and displayed with a single function call. This example uses Nextjournal's default R environment.

hist(rnorm(200, 1), breaks=100, main = "Random Deviates of a Normal Distribution", xlab = "Events")

The output of the previous cell can be referenced in a Bash cell. The file command provides more insight into the nature of the visualization:

file -b Imagehistogram↩ 

The output is an SVG file. Nextjournal can display other data types as well:

png('/results/my-plot.png')

file -b Imagepng-output↩

This works as expected: the first output is an SVG file and the second is a PNG file.

Take note of png('/results/my-plot.png') in the code cell above. Nextjournal will attempt to work with any file added to the /results directory. This may be uploaded data or images or, as in this case, the result of a calculation. For more information on /results, refer to Understanding Results.
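
As a rough Python counterpart to the R and Bash cells above, the sketch below saves one histogram as both PNG and SVG, then checks each file's type from its leading bytes, much as file -b does. The helper names (sniff_image_type, save_histogram), the output file name, and the fallback to a temporary directory when /results is missing or not writable are assumptions made for illustration, not part of any Nextjournal API.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt
import numpy as np

# /results exists on Nextjournal; elsewhere fall back to a temp directory (assumption)
if os.path.isdir("/results") and os.access("/results", os.W_OK):
    RESULTS_DIR = "/results"
else:
    RESULTS_DIR = tempfile.mkdtemp()

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8-byte header of every PNG file


def sniff_image_type(path):
    """Rough stand-in for `file -b`: classify a file as PNG, SVG, or unknown."""
    with open(path, "rb") as handle:
        head = handle.read(1024)
    if head.startswith(PNG_SIGNATURE):
        return "PNG"
    if b"<svg" in head:
        return "SVG"
    return "unknown"


def save_histogram(basename, results_dir=RESULTS_DIR):
    """Draw a histogram of 200 normal deviates and save it as PNG and SVG."""
    rng = np.random.default_rng(seed=0)
    fig, ax = plt.subplots()
    ax.hist(rng.normal(loc=1.0, size=200), bins=100)
    ax.set(xlabel="Events", title="Random Deviates of a Normal Distribution")
    paths = [os.path.join(results_dir, f"{basename}.{ext}") for ext in ("png", "svg")]
    for path in paths:
        fig.savefig(path)  # the format is inferred from the file extension
    plt.close(fig)
    return paths


png_path, svg_path = save_histogram("my-plot-sketch")
print(sniff_image_type(png_path), sniff_image_type(svg_path))
```

On Nextjournal the two files would land in /results; anywhere else they are written to a temporary directory.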

Working With Data

There are several ways to work with data on Nextjournal. You can

  • Use the ➕ insert menu or the ··· action menu and select File to upload data
  • Use an external repository like GitHub or S3
  • Use command line tools like wget
  • Use output generated from code cells

This example will use the first option. Both the ➕ insert menu and the ··· action menu are exposed when selecting or hovering over article elements like paragraphs or code cells.

artist_data.csv
artwork_data.csv

The data above reflects the Tate's permanent collection. Every artist is represented in artist_data.csv; every individual work of art is enumerated in artwork_data.csv.

R

The Nextjournal R Environment provides R. For plotting, Nextjournal supports the default R graphics package (graphics), plotly, and ggplot2 with no additional installation required.

The Default R Graphics Package

This first example uses graphics' smoothScatter() function to plot the birth year of artists represented in the Tate Museum's permanent collection. Note that graphics does not require the loading of any dependencies.

artists <- read.csv(artist_data.csv↩, header=TRUE)
born <- artists$yearOfBirth

# smoothScatter() draws directly to the graphics device, so there is no value to assign or print
smoothScatter(born, seq_along(born), axes=FALSE,
              xlab="Year", ylab="",
              main="Distribution of Artists' Birth Years at the Tate")

# Add the x-axis back with blue tick marks
axis(1, col.ticks="blue")

Working With Dependencies

plotly and ggplot2 are external dependencies that offer more features than the default R graphics.

Load the tidyverse collection of R packages, which includes two dependencies used in the upcoming sections, ggplot2 and readr. The plotly package provides two important plotting functions, plot_ly() and ggplotly().

library(tidyverse)
library(plotly)

ggplot2

ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics.

artists <- read_csv(artist_data.csv↩)
born <- artists$yearOfBirth
df <- data.frame(born)

# Plot each artist's birth year against the row index
ggplot(df, mapping=aes(x = born, y = as.numeric(row.names(df)))) +
  geom_point(size=2.2, alpha=0.4, shape=15) +
  labs(x = "Year", y = NULL,
       title = "Distribution of Artists' Birth Years at the Tate",
       subtitle = "From the Museum's Permanent Collection") +
  theme_bw() +
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.y = element_blank())

plotly

plotly on R enables interactive graphics and offers advanced plotting features.

This histogram compares the number of male and female artists in the collection by year of birth. The interactivity plotly offers is especially useful here because the two distributions overlap entirely. Hiding the male histogram gives a better sense of the growth in acquisitions of work by female artists; showing it again reveals how far institutions have yet to go.

artists <- read_csv(artist_data.csv↩)
# which() skips rows with a missing gender instead of returning NA rows
female_artists <- artists[which(artists$gender == "Female"), ]
male_artists <- artists[which(artists$gender == "Male"), ]

plot_ly(alpha=0.6) %>%
  add_histogram(data=female_artists, x=~yearOfBirth, name="Females") %>%
  add_histogram(data=male_artists, x=~yearOfBirth, name="Males") %>%
  layout(barmode="overlay", xaxis=list(title="Year of Birth"))

ggplotly

ggplot2 generates static plots; however, if the plotly package is loaded then you can convert your ggplots into plotly ones via the ggplotly() function. In this way you can gain some of plotly's interactive functionality.

More detailed information is available at the Plotly ggplot2 Library documentation.

artists <- read_csv(artist_data.csv↩)
born <- artists$yearOfBirth
df <- data.frame(born)
id <- as.numeric(row.names(df))

ggplotly(ggplot(df, mapping=aes(x = born, y = id)) + 
           geom_point(size=1.5, alpha=0.4, shape=15) +
           labs(x = "Year", y="",
                title = "Distribution of Artists' Birth Years at the Tate") +
           theme_bw() +
           theme(axis.text.y = element_blank(),
                 axis.ticks.y = element_blank(),
                 panel.grid.minor=element_blank(),
                 panel.grid.major.y=element_blank()))

Multiple Plots

A Nextjournal cell can show multiple graphs: the runner detects each new figure automatically and displays them in order.

artworks <- read_csv(artwork_data.csv↩)

drop <- c("accession_number", "artistRole", "artistId", "dateText", "creditLine", "units", "inscription", "thumbnailCopyright", "thumbnailUrl", "url")
artworks_rem <- artworks[ , !(names(artworks) %in% drop)]

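# Keep only rows where height, width, and year are all present
artworks_size <- artworks_rem[complete.cases(artworks_rem[, c("height", "width", "year")]), ]

artworks_size$size <- artworks_size$height * artworks_size$width

# %in% treats a missing medium as "no match" instead of producing NA rows
metal <- artworks_size[artworks_size$medium %in% c("Steel", "Bronze"), ]

# First figure: histogram of acquisition years
plot_ly(data=metal, x=~acquisitionYear, type="histogram", name="Sculptural Acquisitions")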

plot_ly(data=metal, x=~year, y=~acquisitionYear, z=~size, color=~medium, 
        colors = c('#BF382A', '#0C4B8E'), text=~artist,
        marker=list(size=4, opacity=0.5)) %>%
  add_markers() %>%
  layout(scene = list(xaxis = list(title = 'Year Created'),
                     yaxis = list(title = 'Year of Acquisition'),
                     zaxis = list(title = 'Size')),
         annotations = list(
           x = 1.13,
           y = 1.05,
           text = 'Material',
           xref = 'paper',
           yref = 'paper',
           showarrow = FALSE
         ))

Python

The Nextjournal Python Environment provides support for both Python 3 and Python 2.

matplotlib

matplotlib is a library for making 2D plots of arrays in Python.

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

artwork_data = pd.read_csv(artwork_data.csv↩)
# drop() returns a new DataFrame, so assign the result back
artwork_data = artwork_data.drop(columns=["accession_number", "artistRole", "artistId", "dateText", "acquisitionYear", "dimensions", "width", "height", "depth", "creditLine", "units", "inscription", "thumbnailCopyright", "thumbnailUrl", "url"])

# Drop rows with a missing medium; otherwise indexing oil, acrylic, and watercolour artworks yields
# "ValueError: cannot index with vector containing NA / NaN values."
# Replace this line with something more sensible to get a more complete dataset.
artwork_data = artwork_data.dropna(subset=["medium"])

# Coerce non-numeric years to NaN
artwork_data["year"] = pd.to_numeric(artwork_data["year"], errors="coerce")

oil = artwork_data[artwork_data["medium"].str.contains("oil", case=False)]
acrylic = artwork_data[artwork_data["medium"].str.contains("acrylic", case=False)]
watercolour = artwork_data[artwork_data["medium"].str.contains("watercolour", case=False)]

fig, ax = plt.subplots()

ax.set(xlabel='year', ylabel='number of works',
       title='Paintings at the Tate, by Medium')

ax.hist([oil["year"], acrylic["year"], watercolour["year"]], stacked=True,
        label=["oil", "acrylic", "watercolour"])
ax.legend()
fig
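
The comment in the cell above suggests replacing the dropna call with something that keeps more of the data. One option, sketched here on a few made-up rows rather than the Tate data, is to pass na=False to str.contains so that a missing medium simply counts as no match. The sample frame and the years_for helper are hypothetical and exist only for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; the real data comes from artwork_data.csv
sample = pd.DataFrame({
    "medium": ["Oil paint on canvas", None, "Watercolour on paper",
               "Acrylic paint on board", "Oil paint on wood",
               "Graphite on paper", "Oil paint on canvas"],
    "year": ["1850", "1900", "1820", "1975", "no date", "1930", "1888"],
})
# Non-numeric years such as "no date" become NaN
sample["year"] = pd.to_numeric(sample["year"], errors="coerce")


def years_for(frame, keyword):
    """Return the known years of works whose medium mentions `keyword`."""
    # na=False treats a missing medium as "no match" instead of propagating NaN
    mask = frame["medium"].str.contains(keyword, case=False, na=False)
    return frame.loc[mask, "year"].dropna().to_numpy()


labels = ["oil", "acrylic", "watercolour"]
years = [years_for(sample, label) for label in labels]

fig, ax = plt.subplots()
ax.hist(years, stacked=True, label=labels)
ax.set(xlabel="year", ylabel="number of works",
       title="Paintings by Medium (sample data)")
ax.legend()
```

With this approach no rows are removed from the frame itself, so works with an unknown medium remain available for other analyses.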

plotly

plotly's Python graphing library is built on plotly.js and creates interactive, publication-quality graphs.

First, display the information as a table using plotly's Figure Factory module.

import pandas as pd

# plotly imports
import plotly.figure_factory as ff
# plotly.graph_objs contains the helper classes used to build and style plots
import plotly.graph_objs as go

artist_data = pd.read_csv(artist_data.csv↩)

# Display the first 12 rows and 3 columns of the dataframe
ff.create_table(artist_data.iloc[:12,:3], index=False)

Plot two histograms that compare the number of male and female artists in the Tate collection, distributed by year of birth.

import numpy as np

artist_data = pd.read_csv(artist_data.csv↩)

male = artist_data['gender'] == 'Male'
female = artist_data['gender'] == 'Female'

trace1 = go.Histogram(
    x=np.array((artist_data[female]['yearOfBirth'])),
    name='Female')

trace2 = go.Histogram(
    x=np.array((artist_data[male]['yearOfBirth'])),
    name='Male')

trace_data = [trace1, trace2]
layout = go.Layout(
    bargroupgap=0.3)

go.Figure(data=trace_data, layout=layout)

Hovering over a bar shows its underlying values, both here and in the published view. Traces can also be toggled on and off by clicking their entries in the legend.

For more examples and details about this library, please refer to the official Plotly Python Open Source Graphing Library documentation.

Julia

The Nextjournal Julia Environment provides Julia support.

Plots offers the most flexible way to visualize data using Julia in Nextjournal. This preinstalled library provides a unified interface to different plotting backends, including plotly and gr. plotly graphs are interactive, while gr is faster for large data sets.

Documentation exists for both the Plotly Julia Library and the Julia package GR, but because these examples use Plots, the Plots documentation is the most useful supplementary reference.

plotly

using Plots; plotly()
scatter(rand(10), rand(10), title="Plot.ly Backend")

The Plotly Julia Library documentation offers more examples for reference.

gr

using Plots; gr()

gr produces a PNG file, which Nextjournal displays.

scatter(rand(10), rand(10), title="GR Backend")
© 2018 Nextjournal GmbH