SQLite-Columnar Benchmarks

sqlite-columnar Benchmark

Date: 2026-05-10

Single-run command:

make
make benchmarks
build/columnar-analytics-bench ./columnar 10000000 256

For variance-aware runs, use the repeatable benchmark suite:

make variance-bench VARIANCE_REPEATS=9 \
  VARIANCE_DATASETS="small:10000:64 medium:50000:128 wide:50000:512"

The variance suite emits machine-readable lines:

  • dataset,... records load/analyze time, storage size, and zone-map coverage.
  • query,... records row-store and columnar median/p95 timings, median/p95 speedups, result row count, and normalized result hash.

Each query is warmed once before sampling and every sample verifies that the columnar result hash matches the row-store result hash.

Variance Benchmark Results

Command:

make variance-bench VARIANCE_REPEATS=3 \
  VARIANCE_DATASETS='large100k:100000:128 large1m:1000000:128 large10m:10000000:256'

With three repeats, p95 is effectively the slowest sampled run. Use more repeats for publishable tail-latency claims.

Load and analyze:

DatasetRow populate msColumnar populate msAnalyze msRow bytesColumnar bytesSelected chunks
large100k112.364482.392479.47431,596,54442,725,3761 / 2
large1m908.4744,514.6335,287.540315,879,424428,773,3761 / 16
large10m23,528.26565,145.17167,766.3825,867,487,2327,038,214,1443 / 153

10M query medians:

QueryRow median msColumnar median msMedian speedupRow p95 msColumnar p95 msp95 speedup
columnar_sum(v1)4,176.3590.032130,582.95x4,492.2310.13932,337.75x
columnar_avg(v3)4,135.8650.032129,316.82x4,188.9480.04592,745.57x
columnar_count(v1)4,003.9960.032125,193.65x4,028.5050.038106,456.49x
Generic columnar GROUP BY id1, sum(v1)8,701.0825,773.0051.51x9,863.3316,944.5271.42x
columnar_group_sum(id1, v1)7,985.5741,322.1876.04x8,156.2541,322.5396.17x
columnar_group_count(id1)7,921.773560.80414.13x7,935.962561.12414.14x
columnar_group_sum_avg_count(id1, v1)8,476.2381,320.3216.42x8,592.9081,339.4986.42x
columnar_group_min_max_count(id3, v1)6,078.3971,410.1724.31x6,141.3141,416.8344.33x
Specialized clustered WHERE ts BETWEEN 100000 AND 2000004,302.62317.287248.89x4,892.10717.294282.88x
Specialized clustered GROUP BY id1, sum/avg/count(v1)3,962.00114.502273.20x4,122.62414.648281.45x

Median speedup by dataset size:

Query100k1M10M
columnar_sum(v1)449.32x4,243.82x130,582.95x
columnar_avg(v3)536.95x4,484.11x129,316.82x
columnar_count(v1)480.88x4,060.56x125,193.65x
columnar_group_sum(id1, v1)2.13x2.89x6.04x
columnar_group_count(id1)4.15x6.74x14.13x
Specialized clustered range filter1.45x15.45x248.89x
Specialized clustered grouped sum/avg/count2.96x37.23x273.20x

Single-Run Analytical Benchmark

Dataset:

  • Rows: 10,000,000
  • Wide fact table with dimensions ts, id1..id6, measures v1..v3, and cold unused payload columns.
  • Cold text payload: 2 columns x 256 bytes.
  • ts is clustered/monotonic, which lets chunk zone maps prune rowid ranges.
  • Chunk size: 65,536 rowids.

Load and analyze:

MetricValue
Row-store populate17,523.098 ms
Columnar populate62,640.616 ms
Initial columnar_analyze() with global stats + chunk zone maps66,763.592 ms
Incremental columnar_analyze() no-op2.114 ms
Incremental columnar_analyze() after one inserted row82.998 ms
Row-store bytes5,867,487,232
Columnar bytes7,038,214,144
Metadata after initial analyzerow_count=10,000,000, chunk_count=153, dirty_count=0, stats_valid=1
ts zone-map chunks selected3 / 153
ts full-cover chunks eligible for aggregate pushdown1 / 153
Dirty entries after one inserted row14
Metadata dirty count after one inserted row14
Dirty entries after incremental analyze0
Metadata after incremental analyzerow_count=10,000,001, chunk_count=153, dirty_count=0, stats_valid=1

Query results:

QueryRow-store msColumnar msSpeedup
sum(v1) via columnar_sum4,014.1350.04981,800.48x
avg(v3) via columnar_avg3,771.0100.032117,908.83x
count(v1) via columnar_count3,621.5970.031116,803.63x
Generic columnar GROUP BY id1, sum(v1)8,243.5317,116.8001.16x
columnar_group_sum(id1, v1)7,927.1671,325.4685.98x
columnar_group_avg(id1, v3)7,837.2151,456.8625.38x
columnar_group_count(id1)7,483.165575.00413.01x
columnar_group_sum_avg_count(id1, v1)7,802.1181,383.3485.64x
Generic GROUP BY id3, sum(v1), avg(v3)5,921.3535,385.5961.10x
Generic GROUP BY id3, max(v1)-min(v2)5,744.7656,754.2250.85x
columnar_group_min_max_count(id3, v1)6,171.2511,754.6623.52x
columnar_group_range(id3, v1, v2)6,119.3283,485.1481.76x
Generic WHERE id2 BETWEEN 10 AND 204,428.7083,461.5851.28x
Specialized WHERE id2 BETWEEN 10 AND 203,982.0781,477.8952.69x
Generic clustered WHERE ts BETWEEN 100000 AND 2000003,931.4853,133.4161.25x
Specialized clustered WHERE ts BETWEEN 100000 AND 2000003,819.23924.971152.95x
Specialized clustered GROUP BY id1, sum(v1)3,932.21218.469212.91x
Specialized clustered GROUP BY id1, sum/avg/count(v1)3,691.28814.654251.90x

Interpretation:

  • Global sum/avg/count are accelerated by precomputed stats.
  • Generic virtual-table scans benefit from reading fewer columns but still pay SQLite row materialization and generic aggregation costs.
  • Specialized grouped functions avoid generic row materialization.
  • Range-filtered functions use chunk zone maps. They help modestly on uniformly distributed filters such as id2, and dramatically on clustered filters such as ts, where only 3 of 153 chunks are scanned.
  • Scalar range-filtered aggregates use precomputed chunk sum/count for full-cover chunks. In this run, 1 of the 3 selected ts chunks is served from chunk stats instead of row scans.
  • columnar_analyze() is incremental after the initial bootstrap. Persistent metadata tracks row_count, chunk_count, dirty_count, and stats_valid. A no-op analyze on this 10M-row table now returns from metadata in 2.114 ms, and reanalyzing the dirty metadata after a single inserted row took 82.998 ms.
  • The current weak spots remain load time, columnar_analyze() cost, and larger on-disk size due to one SQLite B-tree per column plus zone-map metadata. The expensive analyze cost now applies primarily to the first bootstrap or to large dirty ranges.