
OpenNEX DCP30 Analysis

This notebook illustrates how to analyze a subset of OpenNEX DCP30 data using Python and pandas. Specifically, we will be analyzing temperature data in the Chicago area to understand how the CESM1-CAM5 climate model behaves under different RCP scenarios during the course of this century.

A dataset for this example is available at On that page you will find a bash script that can be used to deploy a Docker container which will serve the selected data. Deployment of the container is beyond the scope of this example.

Import Modules

Let's begin by importing the required modules. We'll need pandas for analysis and plotly to create a chart of our analysis.

import pandas as pd
import plotly.graph_objs as go

Loading Data


The load_data function reads data directly from your access server's endpoint. It accepts the ip_addr parameter, which must correspond to the IP address of your data access server.

Alternatively, you can try using this data file:

data = pd.read_csv(
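)
# Declare the label columns as categoricals.
for col in ['Model', 'Scenario', 'Variable']:
    data[col] = data[col].astype('category')
# Parse the dates and convert the values from kelvins to degrees Celsius.
data['Date'] = pd.to_datetime(data['Date'])
data['Temperature'] = data['Value'] - 273.15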

It's easier to work with the resulting data if we tell pandas about the date and categorical columns. The load_data function declares these column types and also converts the temperature from kelvins to degrees Celsius.
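As a minimal sketch of the typing and conversion steps described above, the snippet below applies them to a few hand-written rows. The column names follow the notebook, but the `prepare` helper, the `SAMPLE_CSV` text, and every value in it (including the `tasmax` variable name) are assumptions made for illustration, not real OpenNEX output.

```python
import io

import pandas as pd

# Hypothetical sample rows: column names follow the notebook, values are made up.
SAMPLE_CSV = """Date,Model,Scenario,Variable,Value
2006-01-01,CESM1-CAM5,rcp45,tasmax,273.15
2006-07-01,CESM1-CAM5,rcp45,tasmax,303.15
2006-07-01,CESM1-CAM5,rcp85,tasmax,304.15
"""


def prepare(frame):
    """Declare column types and add a Celsius column (hypothetical helper)."""
    out = frame.copy()
    # Label columns become categoricals.
    for col in ['Model', 'Scenario', 'Variable']:
        out[col] = out[col].astype('category')
    # Dates are parsed into datetime64 values.
    out['Date'] = pd.to_datetime(out['Date'])
    # Values arrive in kelvins; subtract 273.15 to get degrees Celsius.
    out['Temperature'] = out['Value'] - 273.15
    return out


sample = prepare(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(sample.dtypes)
```

Printing `dtypes` is a quick way to confirm that the label columns are categorical and that `Date` was parsed as a datetime before moving on to the plot.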

Putting it all Together

Let's load the data, quickly inspect it using the head method, then use do_graph to visualize it.


Plotting the Scenarios

After loading that data, we can use plotly to visualize what the model predicts over the course of this century. This function reduces the data to show the warmest month for each year and displays the values under each RCP scenario.

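# Take the model name from the first row for the chart title.
model = data['Model'].iloc[0]
title = "Maximum mean temperature for warmest month using model %s" % (model)

# Label each row with January 1 of its year, then keep the warmest month per year and scenario.
data['Year'] = pd.to_datetime(data['Date'].map(lambda d: "%d-01-01" % (d.year)))
by_year = data.groupby(['Year', 'Scenario'], observed=True)[['Temperature']].max()
groups = by_year.reset_index().set_index('Year').groupby('Scenario', observed=True)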

plot_data = [{'x': grp.index, 'y': grp['Temperature'], 'name': key} for key, grp in groups]

layout = {'title': title, 'xaxis': {'title': 'Year'}, 'yaxis': {'title': 'Temperature [Celsius]'}}
go.Figure(data=plot_data, layout=layout)
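The reduction step used above (warmest month per year and scenario) can be checked on a tiny hand-made frame. This is only a sketch: the `toy` frame and its temperatures are invented for illustration, and the year is kept as a plain integer rather than a date to keep the example short.

```python
import pandas as pd

# Hypothetical monthly values in degrees Celsius; not real DCP30 output.
toy = pd.DataFrame({
    'Date': pd.to_datetime([
        '2050-01-01', '2050-07-01',  # rcp45, 2050
        '2050-01-01', '2050-07-01',  # rcp85, 2050
        '2051-01-01', '2051-07-01',  # rcp45, 2051
    ]),
    'Scenario': ['rcp45', 'rcp45', 'rcp85', 'rcp85', 'rcp45', 'rcp45'],
    'Temperature': [-2.0, 29.5, -1.0, 31.0, -3.0, 30.0],
})

# Group by calendar year and scenario, then keep the largest monthly value.
toy['Year'] = toy['Date'].dt.year
warmest = toy.groupby(['Year', 'Scenario'])['Temperature'].max()
print(warmest)
```

Each (year, scenario) pair collapses to a single value, which is why every scenario appears as one line with one point per year in the chart.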


The plot above begins with a brief historical period at the start of the century, then presents data from the four RCP scenarios. We can see annual fluctuations as well as a clear divergence towards the end of the century. As expected, the most aggressive warming scenario, rcp85, produces the warmest temperatures.
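One way to put a number on the divergence described above is to compare the scenarios year by year. The sketch below assumes a frame of yearly maxima shaped like the one used for the plot. The `yearly` frame and its values are hypothetical and only show the mechanics.

```python
import pandas as pd

# Hypothetical yearly maxima per scenario (degrees Celsius); not real model output.
yearly = pd.DataFrame({
    'Year': [2010, 2010, 2095, 2095],
    'Scenario': ['rcp45', 'rcp85', 'rcp45', 'rcp85'],
    'Temperature': [29.0, 29.2, 31.0, 35.0],
})

# One column per scenario, one row per year.
wide = yearly.pivot(index='Year', columns='Scenario', values='Temperature')

# Spread between the warmest and coolest scenario in each year.
spread = wide.max(axis=1) - wide.min(axis=1)
print(spread)
```

A spread that grows from the early years to the late years corresponds to the widening gap between the lines in the chart.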