Python Documentation

The Guide For Running Python Software on Nextjournal

1.
Getting Started

1.1.
Using a Template

The easiest way to get started is to head to https://nextjournal.com/ → click on Add a new article → select the Python Template.

The Nextjournal Python template

A new article will appear with an instruction, Remix this to get started with Python nil, and a code cell:

import sys; sys.version.split()[0]
  • import sys: the sys module provides information about the Python interpreter
  • sys.version: a string containing version information. Calling split()[0] on it splits the string on whitespace and keeps the first field, which is the version number.
  • nil: is the return value from the Nextjournal cell - in this case Python's version number. Black indicates the result was generated by the current runner. If it is grey, remix this article and press the play button on the cell to boot a Python runner and generate a fresh output. Every runner is allocated on a need-by-need basis and will eventually expire.
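
The same idea can be sketched outside the template cell. The snippet below compares the parsed string with sys.version_info, which exposes the version as numbers and avoids string parsing. The helper name python_version is an illustrative assumption, not part of the template.

```python
import sys


def python_version():
    """Return the interpreter version string, e.g. 'major.minor.micro'."""
    # sys.version looks like "3.x.y (build info) [compiler]";
    # the first whitespace-separated field is the version itself.
    return sys.version.split()[0]


version_text = python_version()
# sys.version_info carries the same information as integers.
version_tuple = tuple(sys.version_info[:3])
print(version_text, version_tuple)
```

The tuple form is usually more convenient when code needs to compare versions rather than display them.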

1.2.
Basics

The basic functionality of Python on Nextjournal is similar to Python in other contexts. The default environment has a variety of packages installed, including matplotlib, pandas, scipy, and plotly.

This code displays a sorted list of all packages that are installed in the Python nil environment.

import pkg_resources

plist = [[d.project_name,d.version] for d in iter(pkg_resources.working_set)]
print(sorted(plist,key=lambda l:l[0].lower()))
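
As a hedged alternative, the same listing can be produced with importlib.metadata from the standard library, since newer setuptools releases mark pkg_resources as deprecated. This sketch assumes Python 3.8 or later; the function name installed_packages is chosen here for illustration.

```python
from importlib import metadata


def installed_packages():
    """Return (name, version) pairs for installed distributions, sorted by name."""
    pairs = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            pairs.append((name, dist.version))
    # set() drops duplicates that can appear when a path is listed twice
    return sorted(set(pairs), key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(name, version)
```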

Since many popular packages are already installed, getting up and running is simple. For example, a quick plot of 500 random values using plotly and numpy can be accomplished in just a few lines.

import plotly.graph_objs as go
import numpy as np

x0 = np.random.randn(500)
x1 = np.random.randn(500)+1

trace1 = go.Histogram(x=x0, opacity=0.75)
trace2 = go.Histogram(x=x1, opacity=0.75)

layout = go.Layout(barmode='overlay')
go.Figure(data=[trace1, trace2], layout=layout)

Note the use of Nextjournal references throughout this tutorial. The first was a reference to the Python version number, nil, and now there is a reference to the number of random values we wish to plot, 500. Read more about Nextjournal references to use them in your articles.

2.
Handling Data

2.1.
Displaying Output

How return values from Nextjournal cells are formatted depends on the type. Strings and simple variables are printed.

n = 100
n

More complicated structures like arrays and dictionaries will appear as an expandable tree. Take care to note this line in the following cell: t = 2*np.random.random(n).

  • n is defined in the previous cell as 100
  • np is defined earlier in this article as the numpy module: import numpy as np

Each cell shares the same Python runtime unless a second runtime is explicitly created.

t = 2*np.random.random(n)
s = np.sin(2 * np.pi * t)
s.tolist()

This cell's return value can be expanded for closer inspection of its elements.

2.2.
Adding Data

There are four ways to add data to a Nextjournal article:

  • Upload a file via the Insert Menu
  • Python methods to download to the filesystem
  • Linux tools in a Bash cell (e.g. curl or wget)
  • Add a private repository
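
The second option in the list above, downloading from Python, can be sketched with the standard library alone. This is a minimal sketch rather than a Nextjournal-specific API: the helper name download and the example URL and destination in the comment are assumptions for illustration.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url, dest):
    """Stream the resource at `url` into the file `dest` and return its path."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so large files are not held in memory all at once.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call (hypothetical URL and destination, not taken from this article):
# download("https://example.com/data.csv", "/tmp/data.csv")
```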

2.2.1.
Inserting Files

2.2.1.1.
Uploading

Use Insert Menu → File to upload a file. The result of uploading cubic.csv:

cubic.csv

Files uploaded using the Insert Menu are stored in a persistent, versioned database as a content-addressed hash. Access a file in a language cell by inserting a reference from the @-menu, which appears when you type the @ symbol.

It's simple to open the named reference and operate on the data, just as you would any file in Python.

with open(cubic.csv, "r") as f:
  data = f.read()
  
[float(x) for x in data.split()]

2.2.1.2.
Downloading

Export results by writing or copying files to the /results directory. Executing the following code makes data available for download.

with open("/results/cubic-output.csv", "w") as f:
  f.write(data)

cubic-output.csv

/results is a powerful, shared file state in Nextjournal. For more information, refer to Understanding /results.
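
When the same export code also has to run outside Nextjournal, where /results may not exist, a fallback directory keeps it working. The sketch below is one way to do that; the helper names output_dir and export_text and the sample values are assumptions made for illustration.

```python
import os
import tempfile


def output_dir(preferred="/results"):
    """Return `preferred` if it is a writable directory, else a fresh temp dir."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.mkdtemp(prefix="results-")


def export_text(filename, text, directory=None):
    """Write `text` to `filename` inside the output directory; return the path."""
    directory = directory or output_dir()
    path = os.path.join(directory, filename)
    with open(path, "w") as f:
        f.write(text)
    return path


# The sample values below are made up for the demonstration.
path = export_text("cubic-output.csv", "0 1 8 27 64\n")
print(path)
```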

2.2.2.
Bash

TODO

3.
Plotting Data

Data can be graphically displayed in multiple ways on Nextjournal. Any graphical system that can output to an image file can be shown by saving the file to the /results directory. Under Python, matplotlib and Plotly are directly supported by the platform.

3.1.
Basic Plotting

3.1.1.
Matplotlib

Recall this code that we defined earlier in the article:

t = 2*np.random.random(n)
s = np.sin(2 * np.pi * t)
s.tolist()

s is easily plotted using matplotlib. Nextjournal will detect a PyPlot figure object returned from a cell and display it as an SVG file. To display multiple plots from a single code cell, or to control the filetype, pyplot.savefig() can be used:

import matplotlib
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
plt.scatter(t, s)

ax.set(title="Sine Wave, 100 Random Points")
ax.grid()
plt.savefig("/results/sine.png")

for i in [1,2,3]:
  n = 5**i
  t = 2*np.random.random(n,)
  s = np.sin(2 * np.pi * t)

  fig, ax = plt.subplots()
  plt.scatter(t, s)

  ax.set(title="Sine Wave, %d Points"%n)
  ax.grid()
  plt.savefig("/results/%d.png"%i)
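
The loop above can also be wrapped in a function that closes each figure after saving, which avoids accumulating open figures when many plots are produced. This is a sketch under assumptions: the function name save_sine_scatter, the fixed seed, and the fallback to a temporary directory when /results is unavailable are not part of the original article, and the standard library stands in for numpy to keep the example small.

```python
import math
import os
import random
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to files only; no display is needed
import matplotlib.pyplot as plt


def save_sine_scatter(n, out_dir, seed=0):
    """Scatter n random samples of one sine wave and save the plot as a PNG."""
    rng = random.Random(seed)
    t = [2 * rng.random() for _ in range(n)]
    s = [math.sin(2 * math.pi * v) for v in t]

    fig, ax = plt.subplots()
    ax.scatter(t, s)
    ax.set(title="Sine Wave, %d Points" % n)
    ax.grid()

    path = os.path.join(out_dir, "sine-%d.png" % n)
    fig.savefig(path)
    plt.close(fig)  # release the figure once it is written
    return path


# Use /results when it is writable, otherwise a temporary directory.
writable = os.path.isdir("/results") and os.access("/results", os.W_OK)
out_dir = "/results" if writable else tempfile.mkdtemp()
paths = [save_sine_scatter(5 ** i, out_dir) for i in (1, 2, 3)]
print(paths)
```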

3.1.2.
Plotly

Returning a plotly.graph_objs figure from a cell, as in the histogram example earlier in this article, provides full access to the functionality of Plotly on Python.

import plotly.graph_objs as go

import urllib.request
import numpy as np

url = "https://raw.githubusercontent.com/plotly/datasets/master/spectral.csv"
f = urllib.request.urlopen(url)
spectra=np.loadtxt(f, delimiter=',')

traces = []
y_raw = spectra[:, 0] # wavelength
sample_size = spectra.shape[1]-1 
for i in range(1, sample_size):
    z_raw = spectra[:, i]
    x = []
    y = []
    z = []
    ci = int(255/sample_size*i) # ci = "color index"
    for j in range(0, len(z_raw)):
        z.append([z_raw[j], z_raw[j]])
        y.append([y_raw[j], y_raw[j]])
        x.append([i*2, i*2+1])
    traces.append(dict(
        z=z,
        x=x,
        y=y,
        colorscale=[ [i, 'rgb(%d,%d,255)'%(ci, ci)] for i in np.arange(0,1.1,0.1) ],
        showscale=False,
        type='surface',
    ))

layout = go.Layout(title='Ribbon Plot')
go.Figure(data=traces, layout=layout)

Notice that Plotly graphics are interactive in both the edit and published views. The plot above can be rotated by clicking, and zoomed with the mouse wheel.
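
The inner loop of the ribbon plot can be read as follows: each spectrum becomes a surface two columns wide, so every y and z value is duplicated and x spans a fixed strip. The sketch below isolates that step in plain Python. The function name ribbon and the sample numbers are illustrative assumptions, and the colorscale is omitted.

```python
def ribbon(index, y_values, z_values):
    """Build the surface-trace dict for one ribbon (one spectrum column)."""
    # Every row of the surface has two points, at x = 2*index and 2*index + 1,
    # which gives the ribbon its width.
    x = [[index * 2, index * 2 + 1] for _ in z_values]
    # y and z are repeated across the strip so the ribbon is flat across its width.
    y = [[v, v] for v in y_values]
    z = [[v, v] for v in z_values]
    return dict(type="surface", x=x, y=y, z=z, showscale=False)


# Made-up sample: two wavelengths and two intensities.
trace = ribbon(1, [400.0, 401.0], [0.1, 0.2])
print(trace["x"], trace["y"], trace["z"])
```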

3.2.
Plotting From Files

Earlier in this article, the data from cubic.csv was operated upon and written to cubic-output.csv. It is ready to be plotted.

from matplotlib.pyplot import *

with open(cubic-output.csv) as f:
  dataz = f.read()
dataz = [float(x) for x in dataz.split()]

plot(dataz)

title("Cubic Function"); xlabel("x"); ylabel("f(x)");
gcf()

3.3.
Plotting From Files: Permissions

In some cases, files will need an appropriate extension, or will need to be opened with write permissions. A local copy can be made with cp in a Bash cell, or with the equivalent tools in other languages.

import shutil,os

shutil.copyfile(cubic-output.csv,"cubic.csv")

if os.path.isfile("cubic.csv"):
  print("Size:",os.path.getsize("cubic.csv"), "bytes.")
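
If the copy must be writable regardless of the source file's permissions, the write bit can be set explicitly after copying. The following is a self-contained sketch: the helper name writable_copy, the temporary directory, and the sample contents are assumptions for illustration.

```python
import os
import shutil
import stat
import tempfile


def writable_copy(src, dest):
    """Copy src to dest and make sure the owner can write to the copy."""
    shutil.copyfile(src, dest)  # copies contents only, not permission bits
    mode = os.stat(dest).st_mode
    os.chmod(dest, mode | stat.S_IWUSR)
    return dest


# Demonstration with a read-only source file in a temporary directory.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "source.dat")
with open(source, "w") as f:
    f.write("1 8 27\n")
os.chmod(source, stat.S_IRUSR)  # owner may read, nobody may write

copy = writable_copy(source, os.path.join(workdir, "cubic.csv"))
with open(copy, "a") as f:  # appending works because the copy is writable
    f.write("64\n")
print("Size:", os.path.getsize(copy), "bytes.")
```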

3.4.
Plotting Between Languages

Reactive references even work between different languages, including client-side ClojureScript and JavaScript.

/* Referencing Python data in a Javascript cell! */

var data = nil
for (var i=1, xes=[]; i <= data.length; i++) xes.push(i)

var trace = [{ type: 'scatter', mode: 'lines', name: 'cubic',
              x: xes, y: data }]
var layout = { title: 'Cubic Function',
  xaxis: { title: 'x' }, yaxis: { title: 'f(x)' }}

Nextjournal.plot(trace, layout)

4.
Installing Packages

Additional packages that are required for an article should be installed in the standard way using the pip command or other methods, and saved to a new environment. If an existing article has the setup you need, you can transclude its environment into your article, or remix it.

Each Nextjournal code cell runs in a runtime, and each runtime has an environment, which is a Docker container with its own filesystem. In any environment we can install whatever system or language packages we need, modify configuration files, and set up directory and data file structures however we like. Then, we can save and export the environment as a whole for future reproducibility, as well as use by others.

Let's configure an environment for mapmaking, with the geoplot package. We'll install packages in a runtime we name geoplot, and set it to use Nextjournal's default Python environment. This Python 3 environment—as well as its Python 2 counterpart—has a variety of packages installed, including numpy, matplotlib, and plotly:

pip freeze
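
Before installing anything, it can be useful to check from Python which modules are already importable in the current environment. The sketch below uses importlib.util.find_spec for that. The function name missing_packages is an assumption, and note that a module's import name can differ from the name of the distribution that provides it.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# "json" ships with Python; whether "geoplot" is present depends on the environment.
print(missing_packages(["json", "geoplot"]))
```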

4.1.
Conda Package Manager

The default Python environment on Nextjournal includes the conda package manager. Using conda is the easiest way to install packages: it will attempt to install all packages and dependencies in a consistent manner, including system packages and libraries. The Anaconda Cloud has a searchable database of packages and channels; by default we will select only from the anaconda channel.

conda install -y descartes pysal

If we need a different version or esoteric package, we can add other channels.

conda install -y -c conda-forge cartopy

4.2.
The Python Package Index

Not all packages are available through the conda package manager. For packages hosted on the Python Package Index (PyPI), including those distributed as wheel files, pip is available. For any packages that require compilation, we can install gcc first.

apt-get update > /dev/null
apt-get install -y gcc

pip install quilt
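
When an install has to be triggered from a Python cell rather than a Bash cell, a common approach is to call pip through the running interpreter, so the package lands in the environment that interpreter actually uses. This is a general Python sketch, not a Nextjournal-specific mechanism, and the helper names are assumptions.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the command that runs pip with the current interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(package), check=True)


# Only the command is printed here; call pip_install("quilt") to run it.
print(" ".join(pip_install_command("quilt")))
```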

We can also use pip to install development versions from GitHub, though we have to install git first.

apt-get install -y git

pip install git+https://github.com/geopandas/geopandas

Finally, if a package has a setup.py, we can download and install with that.

git clone https://github.com/ResidentMario/geoplot
cd geoplot
python setup.py install

Once everything is set up to our liking, we can save and export the runtime's end state as a new environment using its configuration panel. Using the saved geoplot environment as our Main runtime's environment then ensures that the versions of programs and packages that the article is developed on will be preserved for future reproducibility, even through a remix. Once the article is published, our exported environment will also be available for other articles to use via transclusion.

Here's an example from the geoplot gallery:

quilt install ResidentMario/geoplot_data

# Load the data (uses the `quilt` package).
import geopandas as gpd
from quilt.data.ResidentMario import geoplot_data

continental_cities = gpd.read_file(
  geoplot_data.usa_cities()).query('POP_2010 > 100000')
continental_usa = gpd.read_file(geoplot_data.contiguous_usa())


# Plot the figure.
import geoplot as gplt
import geoplot.crs as gcrs
import matplotlib.pyplot as plt

poly_kwargs = {'linewidth': 0.5, 'edgecolor': 'gray', 'zorder': -1}
point_kwargs = {'linewidth': 0.5, 'edgecolor': 'black', 'alpha': 1}
legend_kwargs = {'bbox_to_anchor': (0.9, 0.9), 'frameon': False}

ax = gplt.polyplot(continental_usa,
                   projection=gcrs.AlbersEqualArea(central_longitude=-98, 
                                                   central_latitude=39.5),
                   **poly_kwargs)

gplt.pointplot(continental_cities, projection=gcrs.AlbersEqualArea(), ax=ax,
               scale='POP_2010', limits=(1, 80),
               hue='POP_2010', cmap='Blues',
               legend=True, legend_var='scale',
               legend_values=[8000000, 6000000, 4000000, 2000000, 100000],
               legend_labels=['8 million', '6 million', '4 million', 
                              '2 million', '100 thousand'],
               legend_kwargs=legend_kwargs,
               **point_kwargs)

plt.title("Large cities in the contiguous United States, 2010")
plt.savefig("/results/map.svg", bbox_inches='tight', pad_inches=0.1)

© 2018 Nextjournal GmbH