The development of agriculture in Germany

1. Introduction

This analysis was done for WDR’s story on on the development of agriculture in Germany. It is a first pilot to introduce data-driven analyses and storytelling for one of Germany’s most prestigious science journalism formats.

This notebook documents the underlying data gathering and analysis process. It has been produced by Studio NAND together with ... and ... of

1.1. Technical Setup

pip install altair==2.3.0 xlrd==1.2.0

We start by setting up coders so when a cell returns an altair.Chart it's automatically displayed as an iframe and when it returns a pandas.DataFrame it renders it as nicely as an R table.

def __is_altair_chart(value): 
    import altair 
    return isinstance(value, altair.Chart) 
    return False 
def __write_altair(chart, path):, format='html') 
  return {'content-type': 'text/html', 'kind': 'iframe'} 

register_coder('altair', lambda value: __is_altair_chart(value), lambda value, path: __write_altair(value, path))

def __is_pandas_dataframe(value): 
    import pandas 
    return isinstance(value, pandas.DataFrame) 
    return False 
def __write_dataframe(df, path): 
  return {'content-type': 'text/csv', 'kind': 'table'} 

register_coder('pandas', lambda value: __is_pandas_dataframe(value), lambda value, path: __write_dataframe(value, path))
<map object at 0x7f339eb3cdd8>

2. Analysis

2.1. Farms

The statistical surveys have used different bins over the years to report numbers of farms and total space used over the years. These bins have been manually unified. A few errors in the Excel file were corrected manually. This was the case for some totals (minor inconsistencies). Also inconsistent were the use of number formatting and actual numerical values included in the table. Some cells contain absolute values and use Excel’s number format to display 1000s ha, others contain the 1000s ha as the value itself.

This is the Excel file resulting from the manual processing in a Google Sheet.


We are using Pandas to transform this table from wide to long format for better compatibility with Altair and Tableau.

import pandas as pd

df = pd.read_excel(
, 'Betriebe') # ‘un-pivot’ data (wide to long format) df = pd.melt(df, ('Betriebsgröße', 'Typ', 'Land'), var_name='Jahr', value_name='Wert') df['Jahr'] = pd.to_datetime(df['Jahr'], format='%Y') df['Betriebsgröße'] = df['Betriebsgröße'].str.strip() # write the result to the file system to download df.to_csv('results/betriebe.csv', index=False) df
0weniger als 5Zahl der BetriebeFrüheres Bundesgebiet1949-01-011003848.0
15 bis 10Zahl der BetriebeFrüheres Bundesgebiet1949-01-01403699.0
210 bis 20Zahl der BetriebeFrüheres Bundesgebiet1949-01-01256255.0
320 bis 50Zahl der BetriebeFrüheres Bundesgebiet1949-01-01112421.0
450 bis 100Zahl der BetriebeFrüheres Bundesgebiet1949-01-0112621.0
5100 und mehrZahl der BetriebeFrüheres Bundesgebiet1949-01-012971.0
6weniger als 5Fläche der Betriebe in 1 000 haFrüheres Bundesgebiet1949-01-012330.373
75 bis 10Fläche der Betriebe in 1 000 haFrüheres Bundesgebiet1949-01-012860.147
810 bis 20Fläche der Betriebe in 1 000 haFrüheres Bundesgebiet1949-01-013543.169
920 bis 50Fläche der Betriebe in 1 000 haFrüheres Bundesgebiet1949-01-013244.029
312 items
showing page 1/32
import altair as alt

categories_order = ['weniger als 5',
                    '5 bis 10',
                    '10 bis 20',
                    '20 bis 50',
                    '50 bis 100',
                    '100 und mehr']

def show(df):
	return alt.Chart(df).mark_area().encode(

2.1.1. Total farm land by size Federal Republic of Germany
size_frd = df[(df['Typ'] == 'Fläche der Betriebe in 1 000 ha') &
              (df['Land'] == 'Früheres Bundesgebiet')]

show(size_frd.sort_values(by='Betriebsgröße')) Former German Democratic Republic

Agriculture in the former GDR was centrally organized through large cooperatives owned by the state. This almost completely eradicated small, privately run farms which is immediately apparent in the first statistical surveys of farm sizes after the German reunification.

size_gdr = df[(df['Typ'] == 'Fläche der Betriebe in 1 000 ha') &
              (df['Land'] == 'Neue Länder')]


2.2. Number of farms by size

Looking at the number of farms in each size group, we can observe a trend across Germany: the general decline of farms. This is likely the result from increasing economic pressure on farmers to produce more goods at lower prices.

2.2.1. Federal Republic of Germany

count_fdr = df[(df['Typ'] == 'Zahl der Betriebe') &
              (df['Land'] == 'Früheres Bundesgebiet')]


2.2.2. Former German Democratic Republic

count_gdr = df[(df['Typ'] == 'Zahl der Betriebe') &
              (df['Land'] == 'Neue Länder')]


3. Agricultural usage by size