Python Environments
This notebook describes and creates the default Python 2 & 3 environments in Nextjournal. Check out the showcase if you want to see what the environment contains. To see how it’s built, see setup.
Showcase
The Python 3 environment runs Python
    pip freeze

System Packages and Basics
A wide variety of support libraries are installed, as well as gcc v7.
Python packages are installed using conda or pip; setuptools is also included. The Test cells at the end of the Python 3 setup print the installed pip and setuptools versions.
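To check which versions of these tools a given runtime actually carries, a small sketch like the following can help. It is not part of the original notebook: the helper name `installed_version` is hypothetical, and it assumes `importlib.metadata` is available (Python 3.8+), falling back to `pkg_resources` from setuptools on older interpreters such as the 3.7 one built here.

```python
def installed_version(name):
    """Return the installed version string of a distribution, or None."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters: pkg_resources ships with setuptools.
    import pkg_resources
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


# Report the two installers mentioned above; None means "not installed".
for tool in ("pip", "setuptools"):
    print(tool, installed_version(tool))
```

Returning None instead of raising keeps the loop usable in images where one of the tools is missing.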
Plotting
The default environment comes with plotly and matplotlib; the Test cells in the setup print their installed versions.
Plotly
Plot a histogram using Plotly, a plotting library for making interactive graphs online.
    import plotly.graph_objs as go
    import numpy as np

    x0 = np.random.randn(500)
    x1 = np.random.randn(500) + 1

    trace1 = go.Histogram(x=x0, opacity=0.75)
    trace2 = go.Histogram(x=x1, opacity=0.75)
    layout = go.Layout(barmode='overlay')
    go.Figure(data=[trace1, trace2], layout=layout)

Matplotlib
Plot a 5 hertz sine wave using matplotlib, a Python plotting library.
    import matplotlib.pyplot as plt, numpy as np

    # Data for plotting
    t = np.arange(0.0, 2.0, 0.01)
    s = 1 + np.sin((5 * 2) * np.pi * t)

    # Note that using plt.subplots below is equivalent to using
    # fig = plt.figure() and then ax = fig.add_subplot(111)
    _, ax = plt.subplots()
    ax.plot(t, s)
    ax.set(xlabel='time (s)', ylabel='voltage (mV)', title='Sine Wave')
    ax.grid()
    plt.show()

Data Structures
Nextjournal's default Python environment contains several packages for data manipulation and parsing.
The SciPy ecosystem is available, including scipy, numpy, and pandas.
simplejson makes it easy to encode/decode JSON data structures.
six is included to help smooth differences between Python 2 and 3.
Numpy
Numpy's main object is an N-dimensional array, and the library adds linear algebra, Fourier transform, and random number capabilities on top of it. Here it is used to compute a Mandelbrot set, which is then plotted using matplotlib.
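The Numpy code that follows applies the iteration z = z**2 + c to a whole grid of points at once. As a rough, non-vectorized sketch of the same rule for a single point (the function name `escape_time` is hypothetical and not part of the notebook), one might write:

```python
def escape_time(c, maxit=10):
    """Return the first iteration at which |z|**2 exceeds 4, or maxit if none."""
    z = c
    for i in range(maxit):
        z = z * z + c
        # z * conj(z) is |z|**2, the same divergence test the array version uses.
        if (z * z.conjugate()).real > 2 ** 2:
            return i
    return maxit


# A few sample points: the first two stay bounded, the last escapes at once.
for point in (0j, -1 + 0j, 2 + 2j):
    print(point, escape_time(point))
```

The notebook's array version records `i + 100` rather than `i` for diverging points, which mainly increases their contrast against the non-diverging points (marked `maxit`) in the plot; this sketch returns the plain iteration index.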
    import numpy as np, matplotlib.pyplot as plt

    def mandelbrot(h, w, maxit=10):
        y, x = np.ogrid[-1.4:1.4:h*1j, -2:0.8:w*1j]
        c = x + y*1j
        z = c
        divtime = maxit + np.zeros(z.shape, dtype=int)

        for i in range(maxit):
            z = z**2 + c
            diverge = z * np.conj(z) > 2**2         # who is diverging
            div_now = diverge & (divtime == maxit)  # who is diverging now
            divtime[div_now] = i + 100              # note when
            z[diverge] = 2                          # avoid diverging too much

        return divtime

    plt.subplots(1, figsize=(20, 20))
    plt.imshow(mandelbrot(1000, 1000))
    plt.axis('off')
    plt.show()

Pandas
Pandas makes data analysis easier in Python. For example, a single pandas Series holds both the data and its index labels. Here numpy generates 1000 random values, their cumulative sum is taken, and the result is plotted with matplotlib.
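The pandas code that follows calls `ts.cumsum()` on the random values, which turns independent steps into a random walk. As a plain-Python sketch of what that running total computes (using only the standard library; the seed is an arbitrary choice for reproducibility and is not in the notebook):

```python
import random
from itertools import accumulate

random.seed(0)  # arbitrary seed, only so repeated runs match

# 1000 draws from a standard normal distribution, like np.random.randn(1000).
steps = [random.gauss(0, 1) for _ in range(1000)]

# Running total: walk[k] is the sum of steps[0..k], the analogue of cumsum().
walk = list(accumulate(steps))

print(walk[:3])
```

What pandas adds on top of this is the date index, so that each running total is labelled with a day starting at 1/1/2000.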
    import pandas as pd, matplotlib.pyplot as plt, numpy as np

    ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
    ts = ts.cumsum()
    _, ax = plt.subplots()
    ax = ts.plot()
    plt.show()

Simplejson
Import and export JSON on Nextjournal using simplejson. In the example below, a Python data structure is serialized to a JSON string; the change from None to null is a clear indicator.
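The reverse direction is worth a quick look too. The sketch below is an addition rather than part of the notebook: it encodes the same structure and decodes it again, falling back to the standard-library `json` module (whose `dumps`/`loads` calls match the ones used here) if simplejson is not installed.

```python
try:
    import simplejson as json  # the package installed in this environment
except ImportError:
    import json  # standard-library fallback with the same dumps/loads calls

data = ['foo', {'bar': ('baz', None, 1.0, 2)}]

encoded = json.dumps(data)     # Python objects -> JSON text
decoded = json.loads(encoded)  # JSON text -> Python objects

print(encoded)
print(decoded)
```

The round trip is not exact: JSON has no tuple type, so the tuple comes back as a list, while null is mapped back to None.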
    import simplejson as json
    json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])

Six
Six makes it easy to write Python code that is compatible with both Python 2 and Python 3.
For example, Python 2's urllib, urllib2, and urlparse modules have been combined in the urllib package in Python 3. The six.moves.urllib package is a version-independent location for this functionality.
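The same relocation covers URL parsing, which does not need network access to try out. The following is a sketch under assumptions: the `/search` path and the query are made up for illustration, and the fallback import is only there so the snippet also runs where six is absent.

```python
try:
    # Version-independent location provided by six.
    from six.moves.urllib.parse import urlencode, urlparse
except ImportError:
    # Python 3 standard library, which is what six.moves maps to there.
    from urllib.parse import urlencode, urlparse

query = urlencode({"q": "python 3"})
url = "http://nextjournal.com/search?" + query  # hypothetical path and query

parts = urlparse(url)
print(parts.netloc, parts.path, parts.query)
```

On Python 2, six resolves the same names to the old `urllib` and `urlparse` modules, so the calling code stays identical.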
Python 2:
    from __future__ import print_function
    from six.moves.urllib.request import urlopen

    url = urlopen("http://nextjournal.com")
    print(url.read())

Python 3:
    from __future__ import print_function
    from six.moves.urllib.request import urlopen

    url = urlopen("http://nextjournal.com")
    print(url.read())

Data Storage
Apache Arrow
    import numpy as np
    import pandas as pd
    import pyarrow as pa

    # Converting Pandas Dataframe to Apache Arrow Table
    df = pd.DataFrame({"one": [20, np.nan, 2.5],
                       "two": ["january", "february", "march"],
                       "three": [True, False, True]},
                      index=list("abc"))
    table = pa.Table.from_pandas(df)

    # Writing a parquet file from Apache Arrow
    import pyarrow.parquet as pq
    pq.write_table(table, "/shared/example.parquet")

    # Reading a parquet file
    table2 = pq.read_table("/shared/example.parquet")

    # Converting the Arrow Table back to a Pandas Dataframe
    df_new = table2.to_pandas()
    df_new == df  # element-wise comparison; the NaN cell compares as False

Setup
Build a Minimal Python 3 Environment
Download and install conda.
    CONDA_VER="4.8.3"
    PYTHON_VER="py37"
    file="Miniconda3-${PYTHON_VER}_${CONDA_VER}-Linux-x86_64.sh"

    wget -q --show-progress --progress=bar:force -P /results \
      https://repo.continuum.io/miniconda/${file}

    bash Miniconda3-py37_4.8.3-Linux-x86_64.sh -b -p /opt/conda

Links to make sure conda Python supersedes system Python for non-absolute, non-versioned calls.
    ln -s /opt/conda/bin/pip /opt/conda/bin/pip3
    ln -s /opt/conda/bin/pip /opt/conda/bin/pip3.7
    ln -s /opt/conda/bin/python3.7 /opt/conda/bin/python3m
    ln -s /opt/conda/bin/python3.7m-config /opt/conda/bin/python3m-config

Add conda's library directory so ldconfig will pick it up, set conda config, and ensure pip is reasonably updated. We also pin Python to the installed minor version, allowing only patch-version up/downgrades.
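The pinning relies on two sed substitutions that pull version numbers out of the `pip --version` and `python --version` output. As an illustration only, here are the same patterns as Python regular expressions; the function names are hypothetical and the sample version strings are assumed for the example, not taken from a real run of this build.

```python
import re


def pip_version(pip_output):
    """Mirror sed 's/pip \\(.*\\) from.*/\\1/': keep the text between 'pip ' and ' from'."""
    return re.sub(r"pip (.*) from.*", r"\1", pip_output)


def python_minor(python_output):
    """Mirror sed 's/Python \\(.*\\)\\..*/\\1/': drop everything after the last dot."""
    return re.sub(r"Python (.*)\..*", r"\1", python_output)


# Assumed sample outputs, for illustration.
pip_out = "pip 20.0.2 from /opt/conda/lib/python3.7/site-packages/pip (python 3.7)"
py_out = "Python 3.7.6"

# The two lines that would end up in conda-meta/pinned for these samples.
pinned = [
    "pip >=" + pip_version(pip_out),
    "python =" + python_minor(py_out),
]
print("\n".join(pinned))
```

Because the capture group is greedy, it runs up to the last dot in the string. That yields the minor version for a plain `Python X.Y.Z` string, but it would capture more if the version output carried trailing text containing a dot.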
    # make this the last alphabetically => lowest precedence libraries
    echo "/opt/conda/lib" >> /etc/ld.so.conf.d/zz-conda.conf

    mkdir ~/.conda/pkgs # prevent a warning
    conda config --set always_yes True

    pip_ver=$(pip --version | sed 's/pip \(.*\) from.*/\1/')
    echo "pip >=$pip_ver" > /opt/conda/conda-meta/pinned # prevent pip downgrade

    # upgrade Python within minor version
    python_minor=$(python --version | sed 's/Python \(.*\)\..*/\1/')
    echo "python =$python_minor" >> /opt/conda/conda-meta/pinned

    conda update python pip
    conda update -yn base conda

    conda clean -qtipy

    ldconfig

    python -V
    pip -V

Package up the installation for use in other environments.
    du -hsx /
    tar -zcPf /results/minimal-python3.tgz /opt/conda

Build the Default Python 3 Environment
Install
We just need a few system libraries, particularly for HDF5 support.
    apt-get -qq update
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends \
      libxext6 libhdf5-100
    apt-get clean
    rm -r /var/lib/apt/lists/*

This default image has support for a number of general-use packages, including pandas, scipy, scikit-learn, scikit-image, and opencv-python. For graphical output, matplotlib and plotly are installed. We'll also install some basic utilities, as well as setuptools to make any additional installs less difficult. We're installing Jedi to have code completions for Python, and Jupyter to support notebook imports.
    conda install -c plotly \
      setuptools six simplejson dill pillow pytables h5py \
      plotly matplotlib tqdm termcolor tabulate \
      python-dateutil more-itertools toolz cython cffi attrs decorator jedi \
      numpy scipy patsy statsmodels pandas pandas-datareader seaborn \
      scikit-learn scikit-image \
      jupyter

    conda clean -qtipy
    ldconfig

    # make sure jupyter components are up-to-date
    # also add non-anaconda-main packages here (conda-forge packages can be broken)
    pip install --upgrade altair pandas pyarrow feather-format \
      pipenv jupyter-client jupyter-core

    python -V
    pip -V
    jupyter --version
    jupyter kernelspec list
    jupyter --paths

And we'll install the unofficial wheel of OpenCV.
    pip install opencv-python-headless

Pre-import packages to speed up cold boot time.
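The speed-up presumably comes from Python compiling each module on first import and caching the bytecode on disk, so later imports in the finished image can skip that step. A minimal Python sketch of the same idea follows; the helper name `preimport` and the short module list are hypothetical stand-ins for the full package list used in the shell cell.

```python
import importlib


def preimport(module_names):
    """Import each module once and report which succeeded and which failed."""
    loaded, failed = [], []
    for name in module_names:
        try:
            importlib.import_module(name)
            loaded.append(name)
        except ImportError:
            failed.append(name)
    return loaded, failed


# Stand-in list in the same comma-separated format as PI_PKGS; the last
# entry is deliberately bogus to show how a failure is reported.
names = [n.strip() for n in "json, math, no_such_module_xyz".split(",")]
loaded, failed = preimport(names)
print("loaded:", loaded)
print("failed:", failed)
```

Unlike a single `import a, b, c` statement, which stops at the first missing module, this variant keeps going and lists every failure, which can make a broken package easier to spot during a build.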
    PI_PKGS="altair, backcall, bleach, certifi, cffi, chardet, cloudpickle, conda, conda_package_handling, cryptography, cycler, cython, cytoolz, dask, decorator, defusedxml, dill, entrypoints, feather, h5py, idna, imageio, importlib_metadata, ipykernel, ipython_genutils, ipywidgets, jedi, jinja2, joblib, jsonschema, jupyter, jupyter_client, jupyter_console, jupyter_core, kiwisolver, lxml, markupsafe, matplotlib, mistune, mkl_fft, mkl_random, mock, more_itertools, nbconvert, nbformat, networkx, notebook, numexpr, numpy, olefile, cv2, pandas, pandas_datareader, pandocfilters, parso, patsy, pexpect, pickleshare, PIL, pipenv, plotly, prometheus_client, prompt_toolkit, ptyprocess, pyarrow, pycosat, pycparser, pygments, OpenSSL, pyparsing, pyrsistent, socks, dateutil, pytz, pywt, zmq, qtconsole, requests, retrying, ruamel_yaml, skimage, sklearn, scipy, seaborn, send2trash, simplejson, six, statsmodels, tables, tabulate, termcolor, terminado, testpath, toolz, tornado, tqdm, traitlets, urllib3, virtualenv, wcwidth, webencodings, widgetsnbextension, zipp, ipykernel.pylab.backend_inline"

    python -c "import $PI_PKGS"

    pkgs=$(echo $PI_PKGS | sed 's/,//g')
    for pkg in $pkgs; do
      python -c "from $pkg import *"
    done

Finally, set up default fonts for matplotlib.
    mkdir -p ~/.config/matplotlib/
    echo 'font.family: sans-serif
    font.sans-serif: Fira Sans, PT Sans, Open Sans, Roboto, DejaVu Sans, Liberation Sans, sans-serif
    font.serif: PT Serif, Noto Serif, DejaVu Serif, Liberation Serif, serif
    font.monospace: Fira Mono, Roboto Mono, DejaVu Sans Mono, Liberation Mono, Fixed, Terminal, monospace' > ~/.config/matplotlib/matplotlibrc

Check size and final tests.
    python -V
    pip -V
    conda -V
    du -hsx /

Incremental Additions
    pip install vega_datasets

Test
    python --version
    jupyter kernelspec list
    jupyter --version
    jupyter --paths

    import platform; platform.python_version()
    import pip; pip.__version__
    import plotly; plotly.__version__
    import numpy as np; np.__version__
    import matplotlib; matplotlib.__version__
    import setuptools; setuptools.__version__
    import six; six.__version__
    import simplejson; simplejson.__version__
    import pandas; pandas.__version__
    import scipy; scipy.__version__

Minimal Python 2
Download and install conda.
    CONDA_VER="4.8.3"
    PYTHON_VER="py27"
    file="Miniconda2-${PYTHON_VER}_${CONDA_VER}-Linux-x86_64.sh"

    wget -q --show-progress --progress=bar:force -P /results \
      https://repo.continuum.io/miniconda/${file}

    bash Miniconda2-py27_4.8.3-Linux-x86_64.sh -b -p /opt/conda

Set up conda, ld, and pip.
    # make this the last alphabetically => lowest precedence libraries
    echo "/opt/conda/lib" >> /etc/ld.so.conf.d/zz-conda.conf

    mkdir ~/.conda/pkgs # prevent a warning
    conda config --set always_yes True

    pip_ver=$(pip --version | sed 's/pip \(.*\) from.*/\1/')
    echo "pip >=$pip_ver" > /opt/conda/conda-meta/pinned # prevent pip downgrade
    echo "python =2.7" >> /opt/conda/conda-meta/pinned # Stick to Python 2.7

    conda update python pip
    conda update -yn base conda

    conda clean -qtipy

    ldconfig

    python -V
    pip -V
    du -hsx /

Default Python 2
Install
    apt-get -qq update
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends \
      libxext6 libhdf5-100
    apt-get clean
    rm -r /var/lib/apt/lists/*

    conda install -c plotly \
      setuptools six simplejson dill pillow pytables h5py \
      plotly matplotlib tqdm termcolor tabulate \
      python-dateutil more-itertools toolz cython cffi attrs decorator jedi \
      numpy scipy patsy statsmodels pandas pandas-datareader seaborn \
      scikit-learn scikit-image \
      jupyter

    conda clean -qtipy
    ldconfig

    # make sure jupyter components are up-to-date
    # also add non-anaconda-main packages here (conda-forge packages can be broken)
    pip install --upgrade altair pandas pyarrow feather-format \
      pipenv jupyter-client jupyter-core

    pip install opencv-python-headless

    mkdir -p ~/.config/matplotlib/
    echo 'font.family: sans-serif
    font.sans-serif: Fira Sans, PT Sans, Open Sans, Roboto, DejaVu Sans, Liberation Sans, sans-serif
    font.serif: PT Serif, Noto Serif, DejaVu Serif, Liberation Serif, serif
    font.monospace: Fira Mono, Roboto Mono, DejaVu Sans Mono, Liberation Mono, Fixed, Terminal, monospace' > ~/.config/matplotlib/matplotlibrc

    python -V
    pip -V
    jupyter --version
    jupyter kernelspec list
    jupyter --paths
    du -hsx /

    PI_PKGS="altair, bleach, certifi, cffi, chardet, cloudpickle, conda, conda_package_handling, cryptography, cycler, cython, cytoolz, dask, decorator, defusedxml, dill, entrypoints, feather, h5py, idna, imageio, importlib_metadata, ipykernel, ipython_genutils, ipywidgets, jedi, jinja2, jsonschema, jupyter, jupyter_client, jupyter_console, jupyter_core, kiwisolver, lxml, markupsafe, matplotlib, mistune, mkl_fft, mkl_random, mock, more_itertools, nbconvert, nbformat, networkx, notebook, numexpr, numpy, olefile, cv2, pandas, pandas_datareader, pandocfilters, parso, patsy, pexpect, pickleshare, PIL, pipenv, plotly, prometheus_client, prompt_toolkit, ptyprocess, pyarrow, pycosat, pycparser, pygments, OpenSSL, pyparsing, pyrsistent, socks, dateutil, pytz, pywt, zmq, qtconsole, requests, retrying, ruamel_yaml, skimage, sklearn, scipy, seaborn, send2trash, simplejson, six, statsmodels, tables, tabulate, termcolor, terminado, testpath, toolz, tornado, tqdm, traitlets, urllib3, virtualenv, wcwidth, webencodings, widgetsnbextension, zipp, ipykernel.pylab.backend_inline"

    python -c "import $PI_PKGS"

Test
    python --version
    jupyter kernelspec list
    jupyter --version
    jupyter --paths

    import platform; platform.python_version()