Benchmark Guide

ClickBench

Download dataset

To download a single partition of the dataset (~100MB):

wget https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_0.parquet -O benchmark/data/hits_0.parquet

To download the entire dataset (~15GB):

wget https://datasets.clickhouse.com/hits_compatible/athena/hits.parquet -O benchmark/clickbench/data/hits.parquet

To download the partitioned dataset (100 files, ~150MB each), in fish:

for i in (seq 0 99)
    wget https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_$i.parquet -O benchmark/clickbench/data/partitioned/hits_$i.parquet
end

Or, in bash:

for i in {0..99}; do
    wget https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_$i.parquet -O benchmark/clickbench/data/partitioned/hits_$i.parquet
done
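
If the loop above is interrupted, rerunning it downloads every file again. A minimal sketch of a resumable variant follows. The URL and target directory are taken from the commands above; the function names (partition_url, missing_partitions, download_missing) and the .part suffix are hypothetical helpers for illustration, not part of the repository.

```shell
# Sketch of a resumable download for the 100 partition files.
# BASE_URL and the default DEST_DIR are copied from the loops above;
# the function names are hypothetical helpers, not part of the repository.
BASE_URL="https://datasets.clickhouse.com/hits_compatible/athena_partitioned"
DEST_DIR="${DEST_DIR:-benchmark/clickbench/data/partitioned}"

# Print the download URL of partition number $1.
partition_url() {
    printf '%s/hits_%s.parquet\n' "$BASE_URL" "$1"
}

# Print the partition numbers (0..99) that have no file in DEST_DIR yet.
missing_partitions() {
    for i in $(seq 0 99); do
        [ -f "$DEST_DIR/hits_$i.parquet" ] || echo "$i"
    done
}

# Download each missing partition to a temporary name and rename it only
# after wget succeeds, so an interrupted transfer is retried on the next run.
download_missing() {
    mkdir -p "$DEST_DIR"
    for i in $(missing_partitions); do
        wget "$(partition_url "$i")" -O "$DEST_DIR/hits_$i.parquet.part" &&
            mv "$DEST_DIR/hits_$i.parquet.part" "$DEST_DIR/hits_$i.parquet"
    done
}
```

After defining the functions in a shell session, run download_missing from the repository root; running it again fetches only the partitions that are still missing.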

Run benchmarks

Minimal

cargo run --release --bin bench_server
cargo run --release --bin clickbench_client -- --query-path benchmark/clickbench/queries/queries.sql --file benchmark/clickbench/data/hits.parquet

Advanced

env RUST_LOG=info RUST_BACKTRACE=1 RUSTFLAGS='-C target-cpu=native' cargo run --release --bin bench_server
env RUST_LOG=info RUST_BACKTRACE=1 RUSTFLAGS='-C target-cpu=native' cargo run --release --bin clickbench_client -- --query-path benchmark/clickbench/queries/queries.sql --file benchmark/clickbench/data/hits.parquet --query 42
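
In both variants the server and the client are presumably meant to run at the same time, for example in two terminals, with the server started first. A minimal sketch of driving both from one script follows. run_with_server is a hypothetical helper, and the fixed wait is an assumption, since this guide does not describe how to detect that the server is ready. Building both binaries beforehand with cargo build --release keeps compile time out of the wait.

```shell
# run_with_server SERVER_CMD CLIENT_CMD [WAIT_SECONDS]
# Hypothetical helper: starts SERVER_CMD in the background, waits a fixed
# number of seconds (assumption: no readiness check is documented), runs
# CLIENT_CMD, then stops the server and returns the client's exit status.
run_with_server() {
    server_cmd=$1
    client_cmd=$2
    wait_seconds=${3:-5}

    # exec replaces the helper shell with the server command, so the
    # kill below is sent to that command rather than to a wrapper shell.
    sh -c "exec $server_cmd" &
    server_pid=$!

    sleep "$wait_seconds"

    client_status=0
    sh -c "$client_cmd" || client_status=$?

    # Stop the background server and reap it.
    kill "$server_pid" 2>/dev/null || true
    wait "$server_pid" 2>/dev/null || true
    return "$client_status"
}

# Example with the minimal commands (not executed here):
# run_with_server \
#     "cargo run --release --bin bench_server" \
#     "cargo run --release --bin clickbench_client -- --query-path benchmark/clickbench/queries/queries.sql --file benchmark/clickbench/data/hits.parquet" \
#     10
```

A fixed delay is the simplest option but not a reliable one; if the server needs longer to start, increase the third argument.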

TPCH

Generate data

(make sure you have uv installed)

cd benchmark/tpch
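uvx --from duckdb python tpch_gen.py --scale 0.01

Note: this generates data at scale factor 0.01, while the client command below reads from benchmark/tpch/data/sf0.1 and benchmark/tpch/answers/sf0.1. Use the same scale factor in both places. The cargo commands use paths relative to the repository root, so return there before running them.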

Run server (same as ClickBench)

cargo run --release --bin bench_server

Run client

env RUST_LOG=info,tpch_client=debug RUSTFLAGS='-C target-cpu=native' cargo run --release --bin tpch_client -- --query-dir benchmark/tpch/queries/ --data-dir benchmark/tpch/data/sf0.1 --iteration 3 --bench-mode liquid-eager-transcode --answer-dir benchmark/tpch/answers/sf0.1

Profile

Flamegraph

To collect flamegraphs on the server side, add --flamegraph-dir benchmark/data/flamegraph to the server command, for example:

cargo run --release --bin bench_server -- --flamegraph-dir benchmark/data/flamegraph

The server generates a flamegraph for each query it executes.

Cache stats

To collect cache stats, add --stats-dir benchmark/data/cache_stats to the server command, for example:

cargo run --release --bin bench_server -- --stats-dir benchmark/data/cache_stats

The server generates a parquet file that contains the cache stats for each query it executes. You can view the stats in the browser with parquet-viewer.

Run encoding benchmarks

RUST_LOG=info RUSTFLAGS='-C target-cpu=native' cargo run --release --bin encoding -- --file benchmark/clickbench/data/hits.parquet --column 2

This will benchmark the encoding time of the URL column.
