The development of agriculture in Germany
1. Introduction
This analysis was done for WDR’s story on Quarks.de on the development of agriculture in Germany. It is a first pilot to introduce data-driven analyses and storytelling for one of Germany’s most prestigious science journalism formats.
This notebook documents the underlying data gathering and analysis process. It has been produced by Studio NAND together with ... and ... of quarks.de
1.1. Technical Setup
pip install altair==2.3.0 xlrd==1.2.0
We start by setting up coders so when a cell returns an altair.Chart
it's automatically displayed as an iframe and when it returns a pandas.DataFrame
it renders it as nicely as an R table.
def __is_altair_chart(value): try: import altair return isinstance(value, altair.Chart) except: return False def __write_altair(chart, path): chart.save(path, format='html') return {'content-type': 'text/html', 'kind': 'iframe'} register_coder('altair', lambda value: __is_altair_chart(value), lambda value, path: __write_altair(value, path)) def __is_pandas_dataframe(value): try: import pandas return isinstance(value, pandas.DataFrame) except: return False def __write_dataframe(df, path): df.to_csv(path) return {'content-type': 'text/csv', 'kind': 'table'} register_coder('pandas', lambda value: __is_pandas_dataframe(value), lambda value, path: __write_dataframe(value, path))
<map object at 0x7f339eb3cdd8>
2. Analysis
2.1. Farms
The statistical surveys have used different bins over the years to report numbers of farms and total space used over the years. These bins have been manually unified. A few errors in the Excel file were corrected manually. This was the case for some totals (minor inconsistencies). Also inconsistent were the use of number formatting and actual numerical values included in the table. Some cells contain absolute values and use Excel’s number format to display 1000s ha, others contain the 1000s ha as the value itself.
This is the Excel file resulting from the manual processing in a Google Sheet.
We are using Pandas to transform this table from wide to long format for better compatibility with Altair and Tableau.
import pandas as pd df = pd.read_excel(Data.xlsx, 'Betriebe') # ‘un-pivot’ data (wide to long format) df = pd.melt(df, ('Betriebsgröße', 'Typ', 'Land'), var_name='Jahr', value_name='Wert') df['Jahr'] = pd.to_datetime(df['Jahr'], format='%Y') df['Betriebsgröße'] = df['Betriebsgröße'].str.strip() # write the result to the file system to download df.to_csv('results/betriebe.csv', index=False) df
Betriebsgröße | Typ | Land | Jahr | Wert | |
---|---|---|---|---|---|
0 | weniger als 5 | Zahl der Betriebe | Früheres Bundesgebiet | 1949-01-01 | 1003848.0 |
1 | 5 bis 10 | Zahl der Betriebe | Früheres Bundesgebiet | 1949-01-01 | 403699.0 |
2 | 10 bis 20 | Zahl der Betriebe | Früheres Bundesgebiet | 1949-01-01 | 256255.0 |
3 | 20 bis 50 | Zahl der Betriebe | Früheres Bundesgebiet | 1949-01-01 | 112421.0 |
4 | 50 bis 100 | Zahl der Betriebe | Früheres Bundesgebiet | 1949-01-01 | 12621.0 |
5 | 100 und mehr | Zahl der Betriebe | Früheres Bundesgebiet | 1949-01-01 | 2971.0 |
6 | weniger als 5 | Fläche der Betriebe in 1 000 ha | Früheres Bundesgebiet | 1949-01-01 | 2330.373 |
7 | 5 bis 10 | Fläche der Betriebe in 1 000 ha | Früheres Bundesgebiet | 1949-01-01 | 2860.147 |
8 | 10 bis 20 | Fläche der Betriebe in 1 000 ha | Früheres Bundesgebiet | 1949-01-01 | 3543.169 |
9 | 20 bis 50 | Fläche der Betriebe in 1 000 ha | Früheres Bundesgebiet | 1949-01-01 | 3244.029 |
import altair as alt categories_order = ['weniger als 5', '5 bis 10', '10 bis 20', '20 bis 50', '50 bis 100', '100 und mehr'] def show(df): return alt.Chart(df).mark_area().encode( x=alt.X('Jahr:T'), y=alt.Y("Wert:Q"), color=alt.Color( 'Betriebsgröße:N', sort=categories_order ) )
2.1.1. Total farm land by size
2.1.1.1. Federal Republic of Germany
size_frd = df[(df['Typ'] == 'Fläche der Betriebe in 1 000 ha') & (df['Land'] == 'Früheres Bundesgebiet')] show(size_frd.sort_values(by='Betriebsgröße'))
2.1.1.2. Former German Democratic Republic
Agriculture in the former GDR was centrally organized through large cooperatives owned by the state. This almost completely eradicated small, privately run farms which is immediately apparent in the first statistical surveys of farm sizes after the German reunification.
size_gdr = df[(df['Typ'] == 'Fläche der Betriebe in 1 000 ha') & (df['Land'] == 'Neue Länder')] show(size_gdr)
2.2. Number of farms by size
Looking at the number of farms in each size group, we can observe a trend across Germany: the general decline of farms. This is likely the result from increasing economic pressure on farmers to produce more goods at lower prices.
2.2.1. Federal Republic of Germany
count_fdr = df[(df['Typ'] == 'Zahl der Betriebe') & (df['Land'] == 'Früheres Bundesgebiet')] show(count_fdr)
2.2.2. Former German Democratic Republic
count_gdr = df[(df['Typ'] == 'Zahl der Betriebe') & (df['Land'] == 'Neue Länder')] show(count_gdr)