What's the carbon footprint of that oil and gas AI contract?

If you think about AI and the impact it has on the world, you might think about its direct impact, and with good reason - all the electricity has to come from somewhere, and the majority of it comes from burning fossil fuels. It's also worth thinking about what it enables, though.

Given COP26 has been very much about climate justice and getting off of fossil fuels, it seemed worth exploring the climate impact of big oil and gas contracts that a number of large tech companies are aggressively chasing.

How much does this AI work expand fossil fuel extraction?

It's hard to find many reliable numbers, but if you look at press releases from large companies, you can find some figures to at least start with.

Take this contract listed in Reuters:

Exxon has pledged to increase its Permian Basin production to 600,000 barrels of oil equivalent per day (boepd) by 2025. The company’s fourth-quarter Permian production was 190,000 barrels of oil and gas per day.

That's an increase of roughly 400k barrels a day (410k, strictly). It's an official statement from Exxon, and it would be reasonable to assume the numbers have been approved by people at both Microsoft and Exxon.

Maybe you feel like that can't all be attributed to Microsoft's tools specifically. How about we look at a press release from Microsoft themselves?

The application of Microsoft technologies by ExxonMobil’s XTO Energy subsidiary – including Dynamics 365, Microsoft Azure, Machine Learning and the Internet of Things – is anticipated to improve capital efficiency and support Permian production growth by as much as 50,000 oil-equivalent barrels per day by 2025.

That's lower. If you wanted to be conservative, you might choose the 50k oil equivalent per day figure, because it's easier to say:

if it weren't for Microsoft's tech being used there, those 50k barrels wouldn't be extracted every day.

If you felt like the expansion wouldn't happen at all without Microsoft's help then you might say:

if it weren't for Microsoft's tech being used there, those 400k barrels wouldn't be extracted every day

In both cases, more fossil fuel extraction is happening, but how much you might allocate to each party is something you'd make a judgement call on yourself.
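
To make the two attribution choices concrete, here is a minimal sketch using only the figures quoted above. The variable names are illustrative and don't come from either press release. Note that subtracting the Reuters figures gives 410,000 barrels a day, which is rounded to 400k in the text.

```python
# Figures quoted above, in barrels of oil equivalent per day (boepd).
PERMIAN_TARGET_2025 = 600_000      # Exxon's pledged Permian production by 2025
PERMIAN_Q4_PRODUCTION = 190_000    # Exxon's reported fourth-quarter production
MICROSOFT_CLAIMED_UPLIFT = 50_000  # "as much as" figure from Microsoft's press release

# The full expansion implied by the Reuters figures.
full_expansion = PERMIAN_TARGET_2025 - PERMIAN_Q4_PRODUCTION

# Two ways of deciding how many extra barrels a day to attribute to the contract.
attribution_scenarios = {
    "conservative (Microsoft's own figure)": MICROSOFT_CLAIMED_UPLIFT,
    "whole expansion": full_expansion,
}

# How much of the planned expansion Microsoft's own figure accounts for.
uplift_share_of_expansion = MICROSOFT_CLAIMED_UPLIFT / full_expansion

for label, barrels_per_day in attribution_scenarios.items():
    print(f"{label}: {barrels_per_day:,} barrels per day")
print(f"Microsoft's figure is {uplift_share_of_expansion:.0%} of the planned expansion")
```

Which scenario you pick is still the judgement call described above; the code only makes the gap between them visible.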

Let's look at another contract, mentioned in this article in Logic Mag:

The multi-million-dollar partnership between Microsoft and Chevron was the reason I went to Kazakhstan. Microsoft sent me to Atyrau for a week-long workshop to help the Tengiz oil field adopt our technology. I was there to talk about computer vision, a field of AI/ML that gives computers the ability to understand digital images, but the workshop covered a range of topics in both AI/ML and cloud computing. We held it for a team at TCO tasked with boosting daily oil production from 600,000 barrels to 1 million. They wanted to learn about how Microsoft technology could help them modernize their oil field and increase efficiency.

That's another 400k barrels. This is harder to confirm independently, because the author, Zero Cool, is a nom de plume for an employee at Microsoft. It's not like you can just email Microsoft to ask for confirmation about numbers released by pseudonymous whistleblowers. The same issues about attributing it all to Microsoft apply too - it's very much a judgement call about how much you would attribute to one party over another.

How much of an oil barrel is typically burned once it's dug up?

If we have an idea of how much more oil is being taken out of the ground, and we know that at least some of it is burned, we can make an estimate for the extra carbon emissions caused by doing so.

A really quick way to come up with this number is to look at the helpful diagram from an explainer page from the Petroleum Services Company:

How much of an oil barrel is burned, vs used for other things?

Let's add these up:

  • 46% for burning as gas(oline)

  • 26% burned as diesel

  • 9% as jet fuel

Add those together, you get 81%. The actual amount may be higher as we know that shipping is reliant on bunker fuel, a really nasty, dirty fuel from the bottom of the barrel. Until a kind soul suggests a number, let's use these for now.
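
As a quick check on that sum, here is a small sketch. The dictionary name and keys are just labels chosen for this example, and the percentages are the three read off the diagram above.

```python
# Shares of a barrel that end up burned as fuel, as read from the diagram above.
burned_fractions = {
    "gasoline": 0.46,
    "diesel": 0.26,
    "jet fuel": 0.09,
}

# Round to two decimal places to avoid floating point noise in the sum.
share_burned = round(sum(burned_fractions.values()), 2)

print(f"Share of a barrel burned as fuel: {share_burned:.0%}")
```

If a figure for bunker fuel turns up, it could be added as another entry in the dictionary without changing anything else.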

How do you model the impact of this?

If you have an idea of how many barrels are being extracted, and you have a reasonable estimate of how much of a barrel ends up being burned, you can arrive at some indicative CO2 emissions figures, by applying an emissions factor for the quantity of oil burned.

The US Environmental Protection Agency (EPA) has a handy greenhouse gas equivalency calculator, and they go into plenty of detail about how they arrive at their figure for the CO2 emitted from burning a barrel of oil. The short version is that for every barrel burned, you can assume 0.43 metric tonnes of CO2 emitted.

Let's represent this with some pretty simple multiplications:

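def barrel_to_co2(no_of_barrels):
  """Estimate the tonnes of CO2 emitted from the share of these barrels burned as fuel."""
  co_2_per_barrel = 0.43    # metric tonnes of CO2 per barrel burned (EPA figure)
  percentage_burned = 0.81  # 81% of the barrel's contents being burned for fuel

  return co_2_per_barrel * percentage_burned * no_of_barrels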

Now we have this, let's work out that figure for 50k barrels per day:

f"For every 50k barrels extracted, we can assume {barrel_to_co2(50_000):,.0f} tonnes of CO2 emitted"

Working out the total emissions from a year

That's a figure per day, but what would we compare it to?

We can multiply it by 365 to get an annual figure.

DAYS_IN_A_YEAR = 365
f"For every 50k barrels extracted a day for a whole year, we can assume {barrel_to_co2(50_000) * DAYS_IN_A_YEAR:,.0f} tonnes of CO2 emitted"

How about the same company's reported carbon emissions?

Microsoft's reported carbon footprint in 2019 was 12,231,000 metric tonnes.

This contract, even when we take the conservative approach, is likely responsible for 6,356,475 metric tonnes of CO2 being emitted, based on the simple multiplication approach above.
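
For anyone who wants to check where the 6,356,475 figure comes from, here is a self-contained sketch of the same multiplication. The constant names are chosen for this example; the values are the ones used throughout this notebook.

```python
CO2_TONNES_PER_BARREL = 0.43    # EPA figure: metric tonnes of CO2 per barrel burned
SHARE_OF_BARREL_BURNED = 0.81   # gasoline + diesel + jet fuel shares from the diagram
EXTRA_BARRELS_PER_DAY = 50_000  # the conservative figure from Microsoft's press release
DAYS_IN_A_YEAR = 365

# Daily emissions from the burned share of the extra barrels.
daily_tonnes = EXTRA_BARRELS_PER_DAY * CO2_TONNES_PER_BARREL * SHARE_OF_BARREL_BURNED

# Scale up to a year, rounding away floating point noise.
annual_tonnes = round(daily_tonnes * DAYS_IN_A_YEAR)

print(f"{annual_tonnes:,} tonnes of CO2 a year")
```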

Let's try comparing this to other well known technology companies.

I've left out Apple and Amazon, because they only seem to have published their carbon emissions after applying market based adjustments for scope 2 - this is accepted under the GHG Protocol, but a little bit like only publishing profit figures, rather than revenue and costs, so I'll address them separately.

company_vs_contracts = [
  ("Google", 17_394_635,),
  ("Microsoft", 12_231_000,),
  ("Facebook", 6_295_000,),
  ("Microsoft's contract with Exxon", 6_356_475,),
]
import altair
import pandas
data = pandas.DataFrame(data=company_vs_contracts, columns=["Company", "Annual CO2e"])
data

However, there's a slight problem - there's likely a bunch of uncertainty in our oil calculations. We could have an error bar or similar, but I don't know how precise the reported emissions are either.

If we round this to the nearest hundred thousand tonnes and express it in millions, we're at least not implying precision that isn't really there in these numbers, and it ought to make it easier to read too.
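
To see what this does to individual numbers before applying it to the whole table, here is a minimal sketch in plain Python. The helper name `to_millions_one_decimal` is made up for this example. Calling `round` with `-5` snaps a value to the nearest 100,000 tonnes, so the result in millions keeps one decimal place.

```python
def to_millions_one_decimal(tonnes):
    # round(..., -5) rounds to the nearest 100,000 tonnes;
    # dividing by a million then leaves one decimal place.
    return round(tonnes, -5) / 1_000_000

# Figures taken from the table above.
examples = {
    "Google": 17_394_635,
    "Microsoft": 12_231_000,
    "Facebook": 6_295_000,
    "Microsoft's contract with Exxon": 6_356_475,
}

rounded = {name: to_millions_one_decimal(tonnes) for name, tonnes in examples.items()}

for name, millions in rounded.items():
    print(f"{name}: {millions} million tonnes")
```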

def round_to_millions(num):
  # https://stackoverflow.com/questions/3410976/how-to-round-a-number-to-significant-figures-in-python
  return round(num, -5) / 1_000_000
# add our new column
data['Annual CO2e millions of tonnes'] = round_to_millions(data["Annual CO2e"])
data

Now we can make a simple chart.

comparison_chart = altair.Chart(data=data, mark='bar', height=300).encode(
  y='Company',
  x="Annual CO2e millions of tonnes",
  color='Company',
  tooltip=["Company", "Annual CO2e millions of tonnes"]
)
comparison_chart

This one contract, and we know this is probably not the only contract Microsoft has, has comparable emissions to all of Facebook's reported organisational footprint, and half of Microsoft's own footprint for 2019.
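
Here is a rough sketch of how those comparisons work out as ratios, using the reported 2019 figures from the table and the conservative contract estimate. The variable names are illustrative only.

```python
# Reported 2019 footprints in metric tonnes, as used in the table above.
reported_2019 = {
    "Google": 17_394_635,
    "Microsoft": 12_231_000,
    "Facebook": 6_295_000,
}

# Conservative annual estimate for the Exxon contract, from the calculation above.
EXXON_CONTRACT_ESTIMATE = 6_356_475

# The contract estimate as a fraction of each company's reported footprint.
contract_vs_company = {
    company: EXXON_CONTRACT_ESTIMATE / tonnes
    for company, tonnes in reported_2019.items()
}

for company, ratio in contract_vs_company.items():
    print(f"Contract estimate is {ratio:.0%} of {company}'s reported footprint")
```

These ratios inherit all the uncertainty in the contract estimate, so they're best read as orders of magnitude rather than precise shares.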

Including Apple and Amazon, and the larger contracts

How about if we include the figures from Apple and Amazon, who have only reported their scope 2 market based emissions figures, and we include the whacking great Kazakhstan Tengiz oil field contract mentioned above?

f"For every 400k barrels extracted a day for a whole year, we can assume {barrel_to_co2(400_000) * DAYS_IN_A_YEAR:,.0f} tonnes of CO2 emitted"

The article mentioned 400k barrels per day, so let's try using that.

We're less certain about these figures, but we're less certain about the figures from Amazon and Apple too. We don't know how much we might attribute to Microsoft's AI services, and because Apple and Amazon aren't disclosing their location based figures alongside the market based ones, their numbers aren't showing the full picture either.

more_companies_vs_contracts = [
  *company_vs_contracts,
  ("Amazon", 51_170_000),
  ("Apple", 25_100_000),
  ("Microsoft contract in Tengiz", 50_851_800),
]
extended_data = pandas.DataFrame(data=more_companies_vs_contracts, columns=["Company", "Annual CO2e"])
extended_data['Annual CO2e millions of tonnes'] = round_to_millions(extended_data["Annual CO2e"])
extended_data

Let's make it a chart so it's easier to compare them visually:

extended_comparison_chart = altair.Chart(data=extended_data, mark='bar', height=300, width=500).encode(
  y='Company',
  color="Company", 
  x="Annual CO2e millions of tonnes",
  tooltip=["Company", "Annual CO2e millions of tonnes"]
)
extended_comparison_chart

When we add the numbers from the larger companies the single Permian contract doesn't look quite as massive. But at the same time, the Tengiz oilfield contract isn't far off the reported emissions from Amazon in 2019.
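
To put a number on "isn't far off", here is a short sketch that rebuilds the Tengiz estimate and compares it with Amazon's reported figure. The constant names are chosen for this example, and it assumes the whole 400k barrel a day increase is attributed to the contract, which is the least conservative reading.

```python
CO2_TONNES_PER_BARREL = 0.43        # EPA figure: metric tonnes of CO2 per barrel burned
SHARE_OF_BARREL_BURNED = 0.81       # share of a barrel burned as fuel
TENGIZ_EXTRA_BARRELS_PER_DAY = 400_000  # 1 million target minus 600,000 current
DAYS_IN_A_YEAR = 365
AMAZON_REPORTED_2019 = 51_170_000   # market based figure, metric tonnes

# Annual estimate for the Tengiz expansion, rounded to whole tonnes.
tengiz_annual_tonnes = round(
    TENGIZ_EXTRA_BARRELS_PER_DAY
    * CO2_TONNES_PER_BARREL
    * SHARE_OF_BARREL_BURNED
    * DAYS_IN_A_YEAR
)

tengiz_vs_amazon = tengiz_annual_tonnes / AMAZON_REPORTED_2019

print(f"Tengiz estimate: {tengiz_annual_tonnes:,} tonnes a year")
print(f"That is {tengiz_vs_amazon:.0%} of Amazon's reported 2019 figure")
```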

Even if the numbers for these contracts are off, it points to a larger issue - there are definitely more than just two oil and gas contracts signed to use AI to accelerate extraction, and it flies in the face of all the science about what we need to do to avoid the worst impacts of the climate crisis.

When we talk about sustainability in the digital realm, it does help to be aware of the direct emissions from tech companies. However, if we don't talk about what the technology is used to enable, we miss a key part of the picture.

If your advice is basically to move all your workloads to the companies accelerating fossil fuel extraction, then while it might be the convenient thing to do - see the green web triangle - it's probably not the systemically aware thing to do.

Would oil supply here not displace oil being produced elsewhere though?

This is a valid question, and if you look at the Greenpeace Oil in the Cloud report from 2019, they do comment on this:

An increase in the supply of oil of 43,000 bbl/day due to technological advances can be expected to lower global oil prices and increase global oil consumption.

This increase in supply will displace some oil production elsewhere in the world, but 100% displacement is unlikely in most circumstances. A review of the literature on oil market elasticities has concluded that a one barrel increase in oil supply leads to a roughly half-barrel increase in global oil consumption (with broad uncertainties).[86] A reasonable range for this “displacement factor” runs from 0.2 to 0.8 barrels consumed for each additional barrel produced.[87]

Applying this very conservative assumption to the additional liquids production, we find that this increase in liquids production alone would lead to an increase in global carbon emissions from oil consumption of roughly 3.4 MtCO2-eq/y.

Greenpeace's calculations are more detailed than this notebook here - they split the barrel of oil into liquid oil and fossil gas, which is how 50,000 barrels a day becomes 43,000 barrels liquid (bbl/day).

They also apply a displacement factor of around 0.5 for the Exxon contract figure.

So, if we followed the same approach we might go from 6.4 million tonnes to around 3.4 million tonnes.

This would mean the contract would only be half of Facebook, not half of Microsoft.
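
As a rough illustration of how much the displacement factor matters, here is a sketch that applies the 0.2 to 0.8 range quoted by Greenpeace to this notebook's own conservative estimate. The names are made up for this example. The central case comes out nearer 3.2 million tonnes than Greenpeace's 3.4 million, presumably because their more detailed method (splitting liquids from gas) differs from the simple multiplication used here.

```python
# This notebook's conservative annual estimate for the Exxon contract, in tonnes.
ANNUAL_ESTIMATE_TONNES = 6_356_475

# Extra barrels consumed globally for each extra barrel produced,
# using the range and central value quoted from the Greenpeace report.
DISPLACEMENT_FACTORS = {"low": 0.2, "central": 0.5, "high": 0.8}

# Net increase in emissions after allowing for displaced production elsewhere,
# expressed in millions of tonnes to one decimal place.
net_increase_millions = {
    label: round(ANNUAL_ESTIMATE_TONNES * factor / 1_000_000, 1)
    for label, factor in DISPLACEMENT_FACTORS.items()
}

for label, millions in net_increase_millions.items():
    print(f"{label}: {millions} million tonnes of CO2 a year")
```

Even at the low end of the range, the net increase is still over a million tonnes a year from a single contract.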

Where are these company figures coming from?

The numbers for company emissions are from their own reports, listed below. In every case below the 2020 report will have the figures for 2019.

Using this data and sharing it yourself.

The contents of this notebook are licensed Creative Commons CC-BY-SA.

You're welcome to share these charts and this notebook, and remix it and try adapting it to other work, as long as you share alike and show attribution.

Generating these charts

The next steps are somewhat specific to Nextjournal, the platform used here to host this notebook and generate the charts, so they're in a downloadable, reusable form. You can freely skip past these to the next section.

# install the necessary dependencies altair needs to generate charts
! curl -fsSL https://deb.nodesource.com/setup_12.x | sudo -E bash -
! apt-get -qq update
! apt-get -qq install nodejs chromium-chromedriver
! npm install --silent vega-lite vega-cli canvas
# install the extra plugin to write charts to files
! pip install -q altair_saver

Now we can generate our downloadable files.

from altair_saver import save

Downloading and sharing these charts

You can download these charts as resolution independent svgs - you might use this in design work, or if you needed to edit them further before sharing them.

save(comparison_chart, './results/comparison_chart.svg')
save(extended_comparison_chart, './results/extended_comparison_chart.svg')

Alternatively these are available as png files that we can include in presentations, social media or blog posts.

save(comparison_chart, './results/comparison_chart.png', scale_factor=2.0)
save(extended_comparison_chart, './results/extended_comparison_chart.png', scale_factor=2.0)

If you know your way around vega lite, you can download the vega spec json file:

save(comparison_chart, './results/comparison_chart.json')
save(extended_comparison_chart, './results/extended_comparison_chart.json')

Finally, if you want to download a standalone HTML page for each chart, you can do so below.

save(comparison_chart, './results/comparison_chart.html')
save(extended_comparison_chart, './results/extended_comparison_chart.html')