Simon Danisch / Oct 17 2018

Minimal Databench

Include the data generation code from the parent article:

G1_1e7_1e2.csv

Extract the first benchmark:

using DataFrames, CSV
using Statistics # mean function
using BenchmarkTools
data_name = "G1_1e7_1e2.csv"
x = CSV.read(data_name, categorical=false);
inner(df) = DataFrame(v1 = sum(df.v1)) # per-group sum of v1, returned as a one-row DataFrame
@btime by($inner, $x, :id1);
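
The call timed above is a grouped sum: split the rows by id1, sum v1 within each group, and combine the results. As a rough illustration only, the same computation can be sketched in plain Julia without DataFrames. The toy vectors and the grouped_sum helper below are hypothetical stand-ins, not part of the benchmark or of the DataFrames API.

```julia
# Hypothetical toy data standing in for the id1 and v1 columns of the CSV.
id1 = ["id001", "id002", "id001", "id003", "id002"]
v1  = [1, 2, 3, 4, 5]

# Split-apply-combine by hand: accumulate the sum of vals for each key.
# grouped_sum is an illustrative helper, not a DataFrames function.
function grouped_sum(keys, vals)
    sums = Dict{eltype(keys), eltype(vals)}()
    for (k, v) in zip(keys, vals)
        sums[k] = get(sums, k, zero(v)) + v
    end
    return sums
end

result = grouped_sum(id1, v1)
```

This is only meant to show what result the benchmark computes per group; it says nothing about how DataFrames implements grouping internally.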

Make a new runtime to test the same benchmark with the nl/grouping2 branch of DataFrames:

using Pkg # provides the pkg"..." string macro
pkg"add DataFrames#nl/grouping2"
using DataFrames, CSV
using Statistics # mean function
using BenchmarkTools
data_name = "G1_1e7_1e2.csv"
x = CSV.read(data_name, categorical=false);
inner(df) = (v1 = sum(df.v1),) # same per-group sum, returned as a NamedTuple instead of a DataFrame
@btime by($inner, $x, :id1);
size(x)