Demo: fetching files from AWS compatible object storage

This notebook demonstrates how you can use S3 compatible object storage from providers other than AWS, like Scaleway, as a replacement for S3.

We use providers like this at the Green Web Foundation for a few reasons, mostly around climate and sustainability. AWS may well be the default choice, and an efficient one, but we feel using them is inconsistent with our own goals.

Given the resources they have available, they could be moving much faster on shifting their own infrastructure away from fossil fuels. And while they are aggressively chasing big oil and gas contracts, we'd rather use services from folks who choose to make money speeding up an energy transition away from fossil fuels, rather than slowing it down. As Alex Steffen says:

When it comes to climate, speed is justice.

Anyway - Scaleway isn't the only option, but they do have a good range of services, and have a better climate record than many other providers.

First, install boto3, the library we use to talk to AWS-like services.

You can probably do the same in your own language if there is a library that lets you set the region and endpoint to connect to, as this demo does.

Extra points if you can tag @mrchrisadams on Twitter with a link to you demoing this in your language - I'll update this doc to link to them.

pip install boto3

Next we need some secrets to use to make our connection.

import os
OBJECT_STORAGE_REGION = os.getenv('OBJECT_STORAGE_REGION')
OBJECT_STORAGE_ENDPOINT = os.getenv('OBJECT_STORAGE_ENDPOINT')
OBJECT_STORAGE_ACCESS_KEY_ID = os.getenv('OBJECT_STORAGE_ACCESS_KEY_ID')
OBJECT_STORAGE_SECRET_ACCESS_KEY = os.getenv('OBJECT_STORAGE_SECRET_ACCESS_KEY')
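If any of these environment variables is unset, `os.getenv` quietly returns `None` and the failure only shows up later when connecting. As a minimal sketch, a small check like the one below can report what is missing up front. The names `REQUIRED_SETTINGS` and `missing_settings` are made up for this illustration and are not part of boto3.

```python
import os

# The settings this notebook reads from the environment.
REQUIRED_SETTINGS = (
    "OBJECT_STORAGE_REGION",
    "OBJECT_STORAGE_ENDPOINT",
    "OBJECT_STORAGE_ACCESS_KEY_ID",
    "OBJECT_STORAGE_SECRET_ACCESS_KEY",
)


def missing_settings(environ=os.environ):
    """Return the names of required settings that are unset or empty.

    Hypothetical helper for this demo: pass any mapping to check it,
    or call it with no arguments to check the real environment.
    """
    return [name for name in REQUIRED_SETTINGS if not environ.get(name)]
```

Calling `missing_settings()` before connecting should give an empty list when everything is in place, and otherwise tells you which secrets still need setting.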

Tell boto we want to connect to a different region than the default AWS one

Regions are big things, and they're a handy 'seam' for us to sub in a different provider of object storage.

For something like Scaleway, we can choose from a number of different regions, including Paris (fr-par), Amsterdam (nl-ams) or Warsaw, Poland (pl-waw). See the full list in their own object storage docs.

OBJECT_STORAGE_REGION = os.getenv('OBJECT_STORAGE_REGION')
import boto3
session = boto3.Session(region_name=OBJECT_STORAGE_REGION)

Next, we use this session to connect to our storage resource.

Once we're connected to the correct region, we authenticate like normal, with an access key and its corresponding secret, just like with AWS.

However, we use a different endpoint to send our requests to, instead of Amazon's API servers.

In this case, the endpoint is the one that corresponds to our region and provider. For Scaleway in Amsterdam, for example, we might have: https://s3.nl-ams.scw.cloud
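For Scaleway, the endpoint follows the region name, so it can be derived rather than typed out by hand. The sketch below assumes the URL pattern seen in the Amsterdam example holds for the three regions named earlier. `scaleway_endpoint` is a hypothetical helper for this demo, and other providers use their own URL schemes, so check their docs before reusing it.

```python
# Scaleway regions named earlier in this notebook.
SCALEWAY_REGIONS = ("fr-par", "nl-ams", "pl-waw")


def scaleway_endpoint(region):
    """Build the object storage endpoint URL for a Scaleway region.

    Hypothetical helper: it only knows the regions listed above and
    raises ValueError for anything else.
    """
    if region not in SCALEWAY_REGIONS:
        raise ValueError(f"unknown Scaleway region: {region!r}")
    return f"https://s3.{region}.scw.cloud"
```

Keeping the region and the endpoint derived from one value makes it harder for the two to drift apart when you switch regions.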

object_storage = session.resource(
    "s3",
    endpoint_url=OBJECT_STORAGE_ENDPOINT,
    aws_access_key_id=OBJECT_STORAGE_ACCESS_KEY_ID,
    aws_secret_access_key=OBJECT_STORAGE_SECRET_ACCESS_KEY,
)

Finally, we look inside our bucket

Our object_storage object has a handy subresource called Bucket, which has a few useful methods for iterating through its contents.

In Python, if we want to run through the contents of a lazy iterator, we can use a list comprehension like so:

bucket = object_storage.Bucket('nextjournal-demo')
# return the contents of our bucket
[obj.key for obj in bucket.objects.all()]
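For a bucket with many objects, listing everything can be slow. A minimal sketch of narrowing the listing to keys under a given prefix is below. `list_keys` is a made-up name for this demo. It relies on the `objects.filter(Prefix=...)` collection method that boto3 bucket resources provide, and it assumes your provider supports prefix filtering the way S3 does.

```python
def list_keys(bucket, prefix=""):
    """Return the keys in `bucket` that start with `prefix`.

    `bucket` is expected to be a boto3 Bucket resource, such as the one
    created above with object_storage.Bucket(...).
    """
    if prefix:
        # Ask the server to do the filtering, so we only iterate
        # over the matching objects.
        objects = bucket.objects.filter(Prefix=prefix)
    else:
        # No prefix given: fall back to listing everything.
        objects = bucket.objects.all()
    return [obj.key for obj in objects]
```

For example, `list_keys(bucket, "images/")` would return only the keys under a hypothetical `images/` prefix.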

Now that we know the file is there, we can download it:

bucket.download_file('good-smiley-fb.png', '/results/good-smiley-fb.png')
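The download above writes to `/results`, which exists on this platform but may not exist elsewhere. As a sketch, a small wrapper can create the destination directory first. `download_to` is a hypothetical helper for this demo, built on the same `bucket.download_file` call used above.

```python
from pathlib import Path


def download_to(bucket, key, directory):
    """Download `key` from `bucket` into `directory`, returning the local path.

    `bucket` is expected to be a boto3 Bucket resource.
    """
    target_dir = Path(directory)
    # Create the destination directory if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component of the key as the local filename.
    destination = target_dir / Path(key).name
    bucket.download_file(key, str(destination))
    return destination
```

With the bucket from earlier, `download_to(bucket, 'good-smiley-fb.png', '/results')` would be the equivalent call.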

It's (largely) that simple!

Design against protocols, not providers

We've started doing this at the Green Web Foundation because, while object storage is a useful technology, we think it's important to be mindful of the 'seams' of the digital services we build, to keep our options open when choosing providers to build on top of.

This is also why we use NextJournal - having every journal downloadable as a Dockerfile, and in various notebook formats, means we can take our work to other providers if need be. We also like how they share how their platform is built, and what their stack is under the hood.

Using another provider of object storage

Try copying this notebook and running it with your own S3 compatible provider - it's a nice, easy-to-understand way to demonstrate compatibility.
