
Documentation Index

Fetch the complete documentation index at: https://mintlify.com/vemetric/vemetric/llms.txt

Use this file to discover all available pages before exploring further.

Architecture

Vemetric is built as a modern, scalable microservices architecture using a TypeScript monorepo. This page explains how all the components work together.

System Overview

Core Services

App Service

Port: 4000 | Technology: Vite + React + Hono

The main application, providing the web dashboard and API.
Frontend Stack:
  • Framework: React 18 with TypeScript
  • Routing: TanStack Router
  • State: Valtio for reactive state management
  • UI: Chakra UI v3 with Ark UI components
  • Data Fetching: TanStack Query + tRPC
  • Charts: Recharts for analytics visualizations
  • Build: Vite for fast HMR and optimized production builds
The frontend is a single-page application that communicates with the backend via tRPC and REST APIs.
Backend Stack:
  • Framework: Hono for lightweight, fast HTTP routing
  • API: tRPC for type-safe API communication
  • Authentication: Better Auth for session management
  • ORM: Prisma for PostgreSQL interactions
  • Logging: Pino structured logging
Key Responsibilities:
  • User authentication and session management
  • Project and organization CRUD operations
  • Analytics queries to ClickHouse
  • API key management
  • Funnel and custom event configuration
  • Avatar upload to S3 (optional)
Key Features:
  • Real-time dashboard updates
  • Project and organization management
  • User invitations and access control
  • Analytics visualization (page views, events, funnels)
  • API key generation
  • Public dashboard sharing
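The tRPC layer is what keeps the React frontend and the Hono backend in sync at the type level. As a rough illustration of that idea (not Vemetric's actual router; the procedure names and shapes below are hypothetical):

```typescript
// Illustrative sketch of end-to-end type safety in the tRPC style.
// Procedure names ("project.list") and payload shapes are made up.

type Procedures = {
  "project.list": { input: { organizationId: string }; output: { id: string; name: string }[] };
  "project.rename": { input: { projectId: string; name: string }; output: { ok: boolean } };
};

// A toy "router": one handler per procedure, typed by the Procedures map.
const handlers: {
  [K in keyof Procedures]: (input: Procedures[K]["input"]) => Procedures[K]["output"];
} = {
  "project.list": ({ organizationId }) => [{ id: "p1", name: `demo-${organizationId}` }],
  "project.rename": () => ({ ok: true }),
};

// The "client" call is fully typed: a wrong input shape fails at compile time.
function call<K extends keyof Procedures>(name: K, input: Procedures[K]["input"]): Procedures[K]["output"] {
  return handlers[name](input);
}

const projects = call("project.list", { organizationId: "org1" });
```

In the real app the handler map lives on the server and the client only imports the router's *type*, so frontend and backend can never drift apart silently.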

Hub Service

Port: 4004 | Technology: Hono

The event ingestion service that receives analytics events from websites and applications.
Event Processing Pipeline:
  1. Receive Event: Accept POST requests from Vemetric SDK
  2. Validate Project: Authenticate via project token
  3. Extract Metadata: Parse user agent, IP geolocation, referrer
  4. Bot Detection: Filter out known bots and crawlers
  5. User Identification: Generate or retrieve user ID from cookie
  6. Queue Jobs: Push events to BullMQ queues for processing
  7. Return Response: Send acknowledgment to client
Supported Event Types:
  • Page Views: Automatic page view tracking
  • Page Leaves: Track when users leave pages
  • Custom Events: User-defined events with custom data
  • User Identification: Associate events with user identifiers
Key Features:
  • High-throughput event ingestion
  • IP-based geolocation (country, city)
  • User agent parsing (browser, OS, device type)
  • Bot filtering with extensive bot list
  • Cookie-based user tracking
  • CORS support for cross-origin requests
  • Request prefetch detection
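The seven-step pipeline above can be condensed into a sketch. The token set, bot pattern, and user-ID generation here are simplified stand-ins for the real implementation; only the queue names match the ones described later in this page.

```typescript
// Simplified sketch of the Hub ingestion pipeline (steps 2, 4, 5, 6, 7).
// Geolocation and user-agent parsing (step 3) are omitted for brevity.

interface RawEvent { token: string; name: string; userAgent: string; userId?: string }
interface QueuedJob { queue: string; payload: { name: string; userId: string } }

const KNOWN_TOKENS = new Set(["proj_token_123"]); // stand-in for a real project lookup
const BOT_PATTERN = /bot|crawler|spider|headless/i; // tiny subset of a real bot list

function ingest(event: RawEvent): { status: number; jobs: QueuedJob[] } {
  if (!KNOWN_TOKENS.has(event.token)) return { status: 401, jobs: [] }; // 2. validate project
  if (BOT_PATTERN.test(event.userAgent)) return { status: 204, jobs: [] }; // 4. drop bot traffic
  const userId = event.userId ?? Math.random().toString(36).slice(2); // 5. stand-in ID generation
  const payload = { name: event.name, userId };
  return {
    status: 200, // 7. acknowledge to the client
    jobs: [ // 6. fan out to the processing queues
      { queue: "event-queue", payload },
      { queue: "session-queue", payload },
      { queue: "user-queue", payload },
    ],
  };
}
```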

Worker Service

Port: None (background worker) | Technology: BullMQ

Background job processor that handles data processing, aggregation, and notifications.
Queue Workers:

Event Worker

Processes raw events and writes to ClickHouse event table.

Session Worker

Aggregates events into sessions with start/end times and duration.

User Worker

Creates and updates user records in ClickHouse.

Device Worker

Tracks unique devices based on user agent fingerprints.

Email Worker

Sends transactional emails and drip campaigns via Postmark.

First Event Worker

Handles special processing for a project’s first event.

User Enrichment Worker

Enriches user data with additional metadata.

Merge User Worker

Merges user records when identifiers are linked.

Salt Rotation Worker

Periodically rotates cryptographic salts for user ID hashing.
Error Handling:
  • Failed jobs are stored in PostgreSQL failed_queue_job table
  • Configurable retry policies with exponential backoff
  • Dead letter queue for permanently failed jobs
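Exponential backoff means each retry waits twice as long as the previous one, which BullMQ expresses via `backoff: { type: "exponential", delay }` in the job options. A sketch of that policy (the base delay, cap, and attempt limit here are illustrative, not Vemetric's configured values):

```typescript
// Exponential-backoff retry policy, BullMQ-style.
const BASE_DELAY_MS = 1_000;  // illustrative base delay
const MAX_DELAY_MS = 60_000;  // illustrative cap

// Delay before retry N (0-indexed): base * 2^attempt, capped at the maximum.
function backoffDelay(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

// After the final attempt fails, the job would be persisted to the
// failed_queue_job table instead of being retried again.
function shouldRetry(attempt: number, maxAttempts: number): boolean {
  return attempt + 1 < maxAttempts;
}
```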

BullBoard UI

Port: 4100 | Technology: Bull Board + Hono

Web UI for monitoring and managing BullMQ job queues.
Features:
  • Real-time queue statistics
  • Job status monitoring (active, completed, failed)
  • Job retry and deletion controls
  • Queue pause/resume controls
  • Job details and error logs
Authentication: Basic auth using BULLBOARD_USERNAME and BULLBOARD_PASSWORD
BullBoard provides administrative access to queues. Always secure it with strong credentials and restrict network access in production.
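A minimal sketch of such a basic-auth check, using a constant-time comparison so the credential check doesn't leak timing information (the actual middleware in the bullboard app may differ in detail):

```typescript
// Basic-auth validation against BULLBOARD_USERNAME / BULLBOARD_PASSWORD.
import { timingSafeEqual } from "node:crypto";

function safeEquals(a: string, b: string): boolean {
  const bufA = Buffer.from(a);
  const bufB = Buffer.from(b);
  // timingSafeEqual throws on length mismatch, so check length first.
  return bufA.length === bufB.length && timingSafeEqual(bufA, bufB);
}

function checkBasicAuth(header: string | undefined, user: string, pass: string): boolean {
  if (!header?.startsWith("Basic ")) return false;
  const decoded = Buffer.from(header.slice(6), "base64").toString("utf8");
  const idx = decoded.indexOf(":");
  if (idx < 0) return false;
  return safeEquals(decoded.slice(0, idx), user) && safeEquals(decoded.slice(idx + 1), pass);
}
```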

Data Storage

PostgreSQL (Primary Database)

Technology: PostgreSQL 17 + Prisma ORM

Schema:
  • user: User accounts with email and profile data
  • session: Active user sessions (Better Auth)
  • account: OAuth provider accounts
  • verification: Email verification tokens
  • organization: Multi-tenant organizations
  • project: Analytics projects with tracking tokens
  • user_organization: User-organization relationships with roles
  • user_project_access: Fine-grained project access control
  • invitation: Pending organization invitations
  • invitation_project_access: Project access for pending invitations
  • api_key: API keys for programmatic access
  • funnel: Funnel definitions with steps
  • user_identification_map: Maps user IDs to custom identifiers
  • salt: Cryptographic salts for user ID hashing
  • billing_info: Paddle subscription data
  • organization.featureFlags: JSON feature flags (column on organization)
  • organization.customPlanEvents: custom event limits (column on organization)
  • email_drip_sequence: Email drip campaign state
  • email_drip_history: Email delivery tracking
  • failed_queue_job: Failed background jobs for debugging
Migrations: Managed by Prisma Migrate
# Create new migration
bun run db:migrate

# Apply migrations
bun run db:deploy

ClickHouse (Analytics Database)

Technology: ClickHouse 23.10

Schema:
Stores all analytics events with rich metadata.
Engine: CollapsingMergeTree(sign) - supports event updates/deletions
Partitioning: By month (toYYYYMM(createdAt))
Key Columns:
  • projectId, userId, sessionId, deviceId
  • name, isPageView, isPageLeave
  • createdAt (DateTime64 with millisecond precision)
  • Device info: osName, clientName, deviceType
  • Location: countryCode, city
  • UTM parameters: utmSource, utmMedium, utmCampaign
  • userIdentifier, userDisplayName
  • customData (JSON string)
  • headers (for custom event metadata)
Migrations: Managed by clickhouse-migrations
# Run migrations
cd packages/clickhouse
bun run migrate-local
ClickHouse uses specialized table engines optimized for analytics:
  • CollapsingMergeTree: Efficient updates via sign column (+1 insert, -1 delete)
  • ReplacingMergeTree: Automatic deduplication based on primary key
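The sign-column mechanics are worth spelling out: an "update" is written as a cancel row (sign = -1) matching the old version plus a fresh row (sign = +1), and a background merge collapses the pairs. Sketched in plain TypeScript with a trimmed-down column set:

```typescript
// Simulates the collapse that CollapsingMergeTree(sign) performs at merge
// time: rows with the same key cancel pairwise; an uncancelled +1 survives.

interface EventRow { eventId: string; name: string; sign: 1 | -1 }

function collapse(rows: EventRow[]): EventRow[] {
  const survivors = new Map<string, { row: EventRow; net: number }>();
  for (const row of rows) {
    const entry = survivors.get(row.eventId) ?? { row, net: 0 };
    entry.net += row.sign;
    if (row.sign === 1) entry.row = row; // the latest +1 version wins
    survivors.set(row.eventId, entry);
  }
  return [...survivors.values()].filter((e) => e.net > 0).map((e) => e.row);
}

const rows: EventRow[] = [
  { eventId: "e1", name: "page_view", sign: 1 },         // original insert
  { eventId: "e1", name: "page_view", sign: -1 },        // cancel row for the update
  { eventId: "e1", name: "page_view_edited", sign: 1 },  // updated version
  { eventId: "e2", name: "signup", sign: 1 },
];
```

Until a merge runs, queries may see both halves of a pair, which is why ClickHouse queries against such tables typically aggregate over `sign`.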

Redis (Cache & Queue)

Technology: Redis 7 with persistence

Usage:

BullMQ Queues

  • Event processing queues
  • Session aggregation queues
  • Email delivery queues
  • User data update queues

Locking

  • User identification locks
  • Distributed locks for race condition prevention
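The lock pattern here is the classic Redis `SET key value NX PX ttl`: acquire only if the key is absent, with an expiry so a crashed holder can't block forever. Sketched with an in-memory map standing in for Redis (key names and TTL are illustrative):

```typescript
// In-memory stand-in for a Redis TTL lock (SET ... NX PX semantics).
const locks = new Map<string, number>(); // key -> expiry timestamp (ms)

function acquireLock(key: string, ttlMs: number, now: number = Date.now()): boolean {
  const expiry = locks.get(key);
  if (expiry !== undefined && expiry > now) return false; // another holder, not yet expired
  locks.set(key, now + ttlMs); // NX: take it; PX: with an expiry
  return true;
}

function releaseLock(key: string): void {
  locks.delete(key);
}
```

With user-identification locks, this prevents two concurrent events for the same user from both creating a user record.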

Caching

  • Project configuration cache
  • User session cache

Rate Limiting

  • API rate limiting (future)
  • Event ingestion throttling (future)
Configuration:
docker-compose.yml
redis:
  image: redis:7-alpine
  command: redis-server --appendonly yes
  volumes:
    - redis_data:/data
Persistence is enabled via AOF (Append-Only File) to prevent data loss.

Data Flow

Event Ingestion Flow

Step 1: SDK Sends Event

The Vemetric SDK (e.g. @vemetric/react, @vemetric/node) sends an event to the Hub:
POST http://hub.yourdomain.com/event
{
  "name": "page_view",
  "url": "https://example.com/page",
  "referrer": "https://google.com"
}
Step 2: Hub Processes Request

Hub service:
  1. Validates project token
  2. Checks for bot traffic
  3. Extracts IP geolocation
  4. Parses user agent
  5. Gets/sets user ID cookie
Step 3: Queue Jobs

Hub pushes jobs to Redis queues:
  • event-queue: Process and store event
  • session-queue: Update session data
  • user-queue: Update user data
  • device-queue: Track device if new
Step 4: Worker Processes Jobs

Worker service picks up jobs and:
  • Writes event to ClickHouse event table
  • Aggregates into session table
  • Creates/updates user records
  • Tracks unique devices
Step 5: Dashboard Queries Data

App service queries ClickHouse for analytics:
SELECT 
  toDate(createdAt) as date,
  count() as views
FROM event
WHERE projectId = ?
  AND isPageView = 1
GROUP BY date
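What the dashboard receives from that query is a per-day count. The same aggregation expressed in TypeScript, using the column names listed in the ClickHouse schema above (sample data is made up):

```typescript
// TypeScript mirror of the daily page-view query: filter by project and
// isPageView, truncate the timestamp to a date, count per date.

interface StoredEvent { projectId: string; isPageView: 0 | 1; createdAt: string } // ISO timestamp

function dailyPageViews(events: StoredEvent[], projectId: string): Map<string, number> {
  const byDate = new Map<string, number>();
  for (const e of events) {
    if (e.projectId !== projectId || e.isPageView !== 1) continue; // WHERE clause
    const date = e.createdAt.slice(0, 10); // toDate(createdAt)
    byDate.set(date, (byDate.get(date) ?? 0) + 1); // count() ... GROUP BY date
  }
  return byDate;
}

const sample: StoredEvent[] = [
  { projectId: "p1", isPageView: 1, createdAt: "2024-05-01T10:00:00Z" },
  { projectId: "p1", isPageView: 1, createdAt: "2024-05-01T11:00:00Z" },
  { projectId: "p1", isPageView: 0, createdAt: "2024-05-01T12:00:00Z" }, // custom event, excluded
  { projectId: "p2", isPageView: 1, createdAt: "2024-05-02T09:00:00Z" }, // other project, excluded
];
const views = dailyPageViews(sample, "p1");
```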

User Identification Flow

Step 1: Anonymous User

First visit: the Hub generates a random userId and sets it in a cookie
Step 2: User Identifies

Application calls identify() with custom identifier:
vemetric.identify({
  identifier: 'user@example.com',
  displayName: 'John Doe'
})
Step 3: Link Identifier

Hub stores mapping in PostgreSQL:
user_identification_map:
  projectId: 123
  userId: abc123 (hashed)
  identifier: user@example.com (hashed)
Step 4: Merge History

The Worker merges all events from the anonymous userId into the identified user
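Both the userId and the identifier are stored hashed with a salt from the salt table, which is also why the Salt Rotation Worker exists: rotating the salt invalidates every stored hash. A sketch of salted hashing, assuming SHA-256 (the actual hash function and salt handling in Vemetric may differ):

```typescript
// Salted hashing for user_identification_map entries (assumed scheme).
import { createHash } from "node:crypto";

function hashIdentifier(identifier: string, salt: string): string {
  return createHash("sha256").update(salt).update(identifier).digest("hex");
}

// Rotating the salt changes every derived hash, even for the same input.
const before = hashIdentifier("user@example.com", "salt-v1");
const after = hashIdentifier("user@example.com", "salt-v2");
```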

Monorepo Structure

Vemetric uses Turborepo for efficient monorepo management.
vemetric/
├── apps/
│   ├── app/              # Main web app (Vite + React + Hono API)
│   ├── hub/              # Event ingestion service
│   ├── worker/           # Background job processor
│   ├── bullboard/        # Queue monitoring UI
│   ├── health-check/     # Health check service
│   ├── dev-proxy/        # Development reverse proxy
│   └── e2e/              # Playwright end-to-end tests
├── packages/
│   ├── database/         # Prisma schema and PostgreSQL client
│   ├── clickhouse/       # ClickHouse client and migrations
│   ├── queues/           # BullMQ queue definitions
│   ├── common/           # Shared utilities and types
│   ├── logger/           # Pino logging configuration
│   ├── email/            # Email templates and sender
│   ├── eslint-config/    # Shared ESLint configuration
│   └── tsconfig/         # Shared TypeScript configurations
├── turbo.json            # Turborepo pipeline configuration
├── package.json          # Root package with workspace scripts
├── docker-compose.yml    # Local infrastructure services
└── .env.example          # Environment variable template

Build Pipeline

turbo.json
{
  "tasks": {
    "build": {
      "dependsOn": ["^db:generate", "^build"]
    },
    "dev": {
      "dependsOn": ["^db:generate"],
      "cache": false,
      "persistent": true
    }
  }
}
Turborepo ensures:
  • Database clients are generated before building apps
  • Dependencies build before dependents
  • Parallel execution where possible
  • Caching for faster builds
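"Dependencies build before dependents" is a topological ordering over the workspace graph. A minimal sketch of how such an order falls out of the `dependsOn` relation (package names are illustrative, not the exact workspace graph):

```typescript
// Depth-first topological sort: each package is emitted only after all of
// its dependencies, which is the guarantee `"dependsOn": ["^build"]` gives.

const dependsOn: Record<string, string[]> = {
  app: ["database", "common"],
  hub: ["queues", "common"],
  queues: ["common"],
  database: [],
  common: [],
};

function buildOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const seen = new Set<string>();
  const visit = (pkg: string): void => {
    if (seen.has(pkg)) return;
    seen.add(pkg);
    for (const dep of graph[pkg] ?? []) visit(dep); // build dependencies first
    order.push(pkg);
  };
  for (const pkg of Object.keys(graph)) visit(pkg);
  return order;
}

const order = buildOrder(dependsOn);
```

Packages with no path between them (here `app` and `hub`) have no ordering constraint, which is what lets Turborepo build them in parallel.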

Scaling Considerations

Horizontal Scaling

App Service

Stateless - scale horizontally behind load balancer. Sessions stored in Redis.

Hub Service

Stateless - scale horizontally for high event throughput.

Worker Service

Scale based on queue depth. Multiple workers process jobs concurrently.

BullBoard

A single instance is sufficient. It connects directly to Redis for queue administration.

Database Scaling

PostgreSQL:
  • Read replicas for analytics queries
  • Connection pooling (PgBouncer)
  • Periodic vacuum and analyze
ClickHouse:
  • Sharding for multi-billion event datasets
  • Replication for high availability
  • Partitioning by month (already configured)
  • TTL policies for data retention
Redis:
  • Redis Cluster for horizontal scaling
  • Redis Sentinel for high availability
  • Persistence: AOF + RDB snapshots

Monitoring Points

Queue Depth

Monitor BullMQ queue sizes. High depth indicates need for more workers.

ClickHouse Query Time

Track query performance. Optimize slow queries with materialized views.

Event Ingestion Rate

Monitor events/second through Hub. Scale Hub replicas as needed.

Database Connections

Track PostgreSQL connection pool usage. Add pooler if saturated.

Redis Memory

Monitor Redis memory usage. Scale or add eviction policies.

Disk Usage

Track ClickHouse data size. Implement TTL or archival policies.

Next Steps

Database Setup

Set up PostgreSQL and ClickHouse databases

Monitoring

Monitor your Vemetric deployment