Viz Platform
Self-hosted, privacy-first, cross-domain analytics and tag management. All services are Rust + Axum in a Cargo workspace monorepo. The front-office tracker is vanilla ES5 (bootstrapper) + ES6+ (modules). The back-office is Next.js 16 with App Router.
Platform overview
System architecture
Services
| Service | Port | Lang | Storage | Role |
|---|---|---|---|---|
| viz-config | 3002 | Rust | PostgreSQL | Config server, bundle serving, admin CRUD API |
| viz-collect | 3001 | Rust | Kafka | BID issuance, HMAC token, pageview ingest — hot path |
| viz-consumer | — | Rust | ClickHouse | Kafka → ClickHouse batch consumer, triggers MVs |
| viz-cmp | 3003 | Rust | PostgreSQL | Consent tier API, cookie issuance, consent records |
| viz-recs | 3005 | Rust | CH · Redis · PG | Popularity-based recommendations, signal storage |
| viz-analytics | 3004 | Rust | ClickHouse | Thin query layer over ClickHouse MVs, Bearer auth |
| viz-backoffice | 3000 | Next.js | — | React dashboard — sites, analytics, docs |
Tracker modules
Page owners embed one script tag: <script src="https://vizdata.io/viz/SITE_ID.js" async></script>. The config script injects window.__VIZ_CFG then async-loads the domain bundle (/b/:version/:sid.js), which contains all four modules concatenated.
| Module | Syntax | Role |
|---|---|---|
| viz.js | ES5 | Config injection + boot.js. Builds cap flags, fires hit pixel, async-loads domain bundle. |
| cmp.js | ES5 | Reads existing consent (localStorage / cookie). Shows banner if none. Dispatches viz:consent. |
| collect.js | ES5 | Calls /init for BID + token. Tracks pageviews, active time, exit links. Queued until consent. |
| recs.js | ES6+ | Polls for bid/token. Fetches /recs/items. Dispatches viz:recs for page integration. |
Tracker boot sequence
Data pipeline
Kafka topic
viz.traffic.raw — 12 partitions, 48h retention, murmur2(site_id) partition key
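To make the partition key concrete, here is a minimal sketch of how a site_id could map to one of the 12 partitions. It follows the 32-bit murmur2 variant used by Kafka's Java client default partitioner as I understand it; the function names `murmur2` and `partition_for` are illustrative, and the result should be checked against the client library that viz-collect actually uses.

```rust
/// 32-bit murmur2 in the style of Kafka's Java client (assumed variant).
fn murmur2(data: &[u8]) -> u32 {
    const SEED: u32 = 0x9747_b28c;
    const M: u32 = 0x5bd1_e995;
    const R: u32 = 24;

    let mut h: u32 = SEED ^ (data.len() as u32);

    // Mix the input four bytes at a time, little-endian.
    let mut chunks = data.chunks_exact(4);
    for chunk in &mut chunks {
        let mut k = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        k = k.wrapping_mul(M);
        k ^= k >> R;
        k = k.wrapping_mul(M);
        h = h.wrapping_mul(M);
        h ^= k;
    }

    // Fold in the 1 to 3 trailing bytes, if any.
    let tail = chunks.remainder();
    if tail.len() >= 3 {
        h ^= (tail[2] as u32) << 16;
    }
    if tail.len() >= 2 {
        h ^= (tail[1] as u32) << 8;
    }
    if !tail.is_empty() {
        h ^= tail[0] as u32;
        h = h.wrapping_mul(M);
    }

    // Final avalanche.
    h ^= h >> 13;
    h = h.wrapping_mul(M);
    h ^= h >> 15;
    h
}

/// Maps a site_id to a partition: clear the sign bit, then take the modulo.
fn partition_for(site_id: &str, num_partitions: u32) -> u32 {
    (murmur2(site_id.as_bytes()) & 0x7fff_ffff) % num_partitions
}
```

Because the mapping depends only on site_id, every event for a given site lands on the same partition, which is what keeps per-site ordering intact.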
ClickHouse write
viz-consumer flushes a batch at 1000 events or after 5 s, whichever comes first. A single ClickHouse node handles 1–3M rows/sec.
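The size-or-time flush rule can be sketched as a small buffer that is drained either when it fills up or when its oldest event has waited long enough. This is an illustration only: the `Batcher` type and its methods are hypothetical, and the real viz-consumer may structure this differently (for example around an async timer).

```rust
use std::time::{Duration, Instant};

/// Hypothetical size-or-time batcher; not the actual viz-consumer type.
struct Batcher<T> {
    buf: Vec<T>,
    max_events: usize,
    max_age: Duration,
    /// When the first event of the current batch arrived.
    opened_at: Option<Instant>,
}

impl<T> Batcher<T> {
    fn new(max_events: usize, max_age: Duration) -> Self {
        Self { buf: Vec::with_capacity(max_events), max_events, max_age, opened_at: None }
    }

    /// Adds one event; returns the batch once the size limit is reached.
    fn push(&mut self, event: T, now: Instant) -> Option<Vec<T>> {
        if self.buf.is_empty() {
            self.opened_at = Some(now);
        }
        self.buf.push(event);
        if self.buf.len() >= self.max_events {
            self.take()
        } else {
            None
        }
    }

    /// Called periodically; returns the batch once its oldest event is max_age old.
    fn tick(&mut self, now: Instant) -> Option<Vec<T>> {
        match self.opened_at {
            Some(t) if now.duration_since(t) >= self.max_age => self.take(),
            _ => None,
        }
    }

    /// Drains the buffer and resets the batch timer.
    fn take(&mut self) -> Option<Vec<T>> {
        self.opened_at = None;
        if self.buf.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buf))
        }
    }
}
```

With the values from the text this would be constructed as `Batcher::new(1000, Duration::from_secs(5))`, and each returned batch would become one ClickHouse insert.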
MV aggregation
mv_pageviews_hourly (SummingMergeTree, per hour/path) + mv_referrers_daily (per day/medium/source)
Token scheme
Formula
token = HMAC-SHA256(
  key  = site_secret_key,
  data = bid + ":" + tid + ":" + floor(ts / 300)
)
// ts in seconds, so each window is 5 minutes
// current + previous window accepted (clock drift)
Why server-issued: The legacy system used RC4(hostname + timestamp) — entirely client-side and trivially spoofable. Viz replaces it with a server-issued HMAC where the key never leaves the server.
Replay protection: Because the server accepts the current and the previous 5-minute window, a captured token can be replayed for at most 10 minutes. The token changes every window; browsers silently refresh it via /init before expiry.
Hot path cost: HMAC-SHA256 recompute is ~1µs pure CPU — no I/O, no cache lookup. Collect stays stateless.
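The window logic can be sketched without committing to a particular crypto crate. In the sketch below the MAC is passed in as a closure that stands in for HMAC-SHA256 keyed with the site secret; `window_index`, `token_message`, `verify_token` and `constant_time_eq` are illustrative names, not the actual viz-collect API, and `ts` is assumed to be Unix time in seconds.

```rust
/// Window length in seconds (the 300 in the formula).
const WINDOW_SECS: u64 = 300;

/// Index of the 5-minute window that contains `ts` (Unix seconds).
fn window_index(ts: u64) -> u64 {
    ts / WINDOW_SECS
}

/// The data fed to the MAC: bid + ":" + tid + ":" + window index.
fn token_message(bid: &str, tid: &str, window: u64) -> String {
    format!("{}:{}:{}", bid, tid, window)
}

/// Accepts a token computed for the current or the previous window.
/// `mac` is a stand-in for HMAC-SHA256 with the site secret as key.
fn verify_token<F>(mac: F, bid: &str, tid: &str, now: u64, presented: &[u8]) -> bool
where
    F: Fn(&[u8]) -> Vec<u8>,
{
    let current = window_index(now);
    // checked_sub avoids underflow when `now` falls in window 0.
    let candidates = [Some(current), current.checked_sub(1)];
    candidates.iter().flatten().any(|&w| {
        let expected = mac(token_message(bid, tid, w).as_bytes());
        constant_time_eq(&expected, presented)
    })
}

/// Compares two byte strings without an early exit on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

A token issued at the very start of a window is accepted until the end of the following window, which is where the 10-minute replay bound comes from. Verification needs only the request fields, the clock and the site secret, so nothing has to be looked up or stored per token.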
Consent tiers (CMP)
Four tiers, from least to most permissive:
- Hit pixel only. No BID, no session, no Collect, no Recs.
- Full Collect (pageview + session). BID set server-side. No Recs.
- Collect + Recs. User profile built for recommendations.
- All of the above + cross-site data sharing if in a network.
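One way to model the four tiers is an ordered enum from which capability flags are derived. The sketch below is hypothetical: the variant names, the `Capabilities` struct and the mapping of `cmp_tier=N` to the values 0 through 3 are assumptions made for illustration, since the tier numbering is not stated here.

```rust
/// The four consent tiers in ascending order (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ConsentTier {
    HitOnly,     // hit pixel only
    Collect,     // pageview + session, BID set server-side
    CollectRecs, // Collect plus recommendations profile
    CrossSite,   // everything, plus cross-site sharing within a network
}

/// What the tracker is allowed to do at a given tier.
#[derive(Debug, PartialEq, Eq)]
struct Capabilities {
    hit_pixel: bool,
    collect: bool,
    recs: bool,
    cross_site: bool,
}

impl ConsentTier {
    /// Parses the N in cmp_tier=N. The 0..=3 numbering is an assumption.
    fn from_cookie_value(n: u8) -> Option<Self> {
        match n {
            0 => Some(ConsentTier::HitOnly),
            1 => Some(ConsentTier::Collect),
            2 => Some(ConsentTier::CollectRecs),
            3 => Some(ConsentTier::CrossSite),
            _ => None,
        }
    }

    /// Each tier includes everything the lower tiers allow.
    fn capabilities(self) -> Capabilities {
        Capabilities {
            hit_pixel: true,
            collect: self >= ConsentTier::Collect,
            recs: self >= ConsentTier::CollectRecs,
            cross_site: self == ConsentTier::CrossSite,
        }
    }
}
```

Deriving `Ord` from the declaration order keeps the "each tier includes the previous one" rule in one place instead of repeating it per flag.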
Consent state is persisted in a server-set cookie (Set-Cookie: cmp_tier=N, HttpOnly), with the localStorage key __vtm_cmp as a fallback.
On consent change, the client sends POST /cmp/consent { tid, bid, tier }; the server sets the cookie and returns a fresh HMAC token.
Storage
ClickHouse
- traffic table: MergeTree, PARTITION BY toYYYYMM(ts), ORDER BY (site_id, toDate(ts), event_type, path)
- mv_pageviews_hourly: SummingMergeTree, GROUP BY site_id/hour/path
- mv_referrers_daily: SummingMergeTree, GROUP BY site_id/day/medium/source
- LowCardinality(String) for repeated values → 5–10× compression

PostgreSQL
- sites: domain configs, features, cmp_config, recs_config (JSONB), secret_key
- consent_records: (bid, tid) PK, tier, consented_at
- schema_migrations: tracked by viz-migrate Kubernetes Job
- In-process DashMap cache in each service (5-min TTL)

Kafka
- viz.traffic.raw: 12 partitions, RF=3, 48h retention
- Partition key: murmur2(site_id), so events per site stay ordered
- 810 KB/s avg · 4 MB/s peak, 100× headroom per partition
- Versioned envelope: { v: 1, ts, data }, forward-compatible

Redis
- recs:vis:{tid}:{bid}: SADD visited paths (7-day TTL)
- recs:clicks:{tid}:{bid}: ZADD click signals, score=ts (30-day TTL)
- recs:impressions:{tid}:{bid}: ZADD impression signals (30-day TTL)
- Used by viz-recs for popularity filtering and future ML training data
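The three Redis key patterns and their TTLs can be kept in one place so that writers and readers cannot drift apart. The sketch below only builds key strings and TTL values; the `Signal` enum and its methods are hypothetical names, and issuing the actual SADD, ZADD and EXPIRE commands is left to whatever Redis client viz-recs uses.

```rust
/// TTLs from the key list above, in seconds.
const VISITED_TTL_SECS: u64 = 7 * 24 * 60 * 60;
const SIGNAL_TTL_SECS: u64 = 30 * 24 * 60 * 60;

/// The three per-visitor signal kinds stored in Redis (illustrative type).
enum Signal {
    Visited,    // set of visited paths (SADD)
    Click,      // sorted set of clicks, score = timestamp (ZADD)
    Impression, // sorted set of impressions (ZADD)
}

impl Signal {
    /// Builds the key recs:<kind>:{tid}:{bid}.
    fn key(&self, tid: &str, bid: &str) -> String {
        let kind = match self {
            Signal::Visited => "vis",
            Signal::Click => "clicks",
            Signal::Impression => "impressions",
        };
        format!("recs:{}:{}:{}", kind, tid, bid)
    }

    /// Expiry to apply after writing to the key.
    fn ttl_secs(&self) -> u64 {
        match self {
            Signal::Visited => VISITED_TTL_SECS,
            Signal::Click | Signal::Impression => SIGNAL_TTL_SECS,
        }
    }
}
```

For example, `Signal::Click.key("t1", "b9")` yields `recs:clicks:t1:b9`.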
Port reference
| Port | Service | Key endpoints |
|---|---|---|
| 3000 | viz-backoffice | Next.js — /sites, /docs |
| 3001 | viz-collect | GET /init · POST /collect · GET /hit |
| 3002 | viz-config | GET /viz/:sid.js · GET /b/:v/:sid.js · /admin/sites CRUD |
| 3003 | viz-cmp | POST /cmp/consent · GET /cmp/status |
| 3004 | viz-analytics | GET /analytics/overview · /timeseries · /pages · /referrers |
| 3005 | viz-recs | GET /recs/items · POST /recs/signal |
| 5432 | PostgreSQL | viz_dev (local) |
| 8123 | ClickHouse | HTTP interface |
| 6379 | Redis | default |
| 19092 | Redpanda | Kafka protocol |
| 18080 | Redpanda Console | Web UI |