Documentation Index
Fetch the complete documentation index at: https://mintlify.com/vemetric/vemetric/llms.txt
Use this file to discover all available pages before exploring further.
Architecture
Vemetric is built as a modern, scalable microservices architecture using a TypeScript monorepo. This page explains how all the components work together.
System Overview
Core Services
App Service
Port: 4000 | Technology: Vite + React + Hono
The main application providing the web dashboard and API.
Frontend (Vite SPA)
- Framework: React 18 with TypeScript
- Routing: TanStack Router
- State: Valtio for reactive state management
- UI: Chakra UI v3 with Ark UI components
- Data Fetching: TanStack Query + tRPC
- Charts: Recharts for analytics visualizations
- Build: Vite for fast HMR and optimized production builds
Backend (Hono API)
- Framework: Hono for lightweight, fast HTTP routing
- API: tRPC for type-safe API communication
- Authentication: Better Auth for session management
- ORM: Prisma for PostgreSQL interactions
- Logging: Pino structured logging
- User authentication and session management
- Project and organization CRUD operations
- Analytics queries to ClickHouse
- API key management
- Funnel and custom event configuration
- Avatar upload to S3 (optional)
- Real-time dashboard updates
- Project and organization management
- User invitations and access control
- Analytics visualization (page views, events, funnels)
- API key generation
- Public dashboard sharing
Hub Service
Port: 4004 | Technology: Hono
The event ingestion service that receives analytics events from websites and applications.
Event Processing Pipeline
- Receive Event: Accept POST requests from Vemetric SDK
- Validate Project: Authenticate via project token
- Extract Metadata: Parse user agent, IP geolocation, referrer
- Bot Detection: Filter out known bots and crawlers
- User Identification: Generate or retrieve user ID from cookie
- Queue Jobs: Push events to BullMQ queues for processing
- Return Response: Send acknowledgment to client
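The pipeline above can be condensed into a few lines of TypeScript. This is an illustrative, in-memory stand-in only: the names (`ingest`, `knownProjects`, `BOT_PATTERN`) are hypothetical, and the real Hub service uses Hono routing, a full bot list, IP geolocation, and BullMQ queues.

```typescript
// Hypothetical, simplified sketch of the Hub ingestion steps above.
type IncomingEvent = {
  token: string;     // project token sent by the Vemetric SDK
  userAgent: string;
  name: string;
};

// Stand-in for the project-token lookup (step 2: validate project)
const knownProjects = new Map([["tok_123", "project-1"]]);

// Stand-in for the worker queues (step 6: queue jobs)
const eventQueue: Array<{ projectId: string; name: string }> = [];

// Tiny stand-in for the extensive bot list (step 4: bot detection)
const BOT_PATTERN = /bot|crawler|spider|curl/i;

function ingest(event: IncomingEvent): { status: number; body: string } {
  const projectId = knownProjects.get(event.token);
  if (!projectId) return { status: 401, body: "unknown project" };
  if (BOT_PATTERN.test(event.userAgent)) {
    return { status: 204, body: "bot ignored" };
  }
  eventQueue.push({ projectId, name: event.name }); // hand off to workers
  return { status: 200, body: "ok" };               // step 7: acknowledge
}
```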
Supported Events
- Page Views: Automatic page view tracking
- Page Leaves: Track when users leave pages
- Custom Events: User-defined events with custom data
- User Identification: Associate events with user identifiers
- High-throughput event ingestion
- IP-based geolocation (country, city)
- User agent parsing (browser, OS, device type)
- Bot filtering with extensive bot list
- Cookie-based user tracking
- CORS support for cross-origin requests
- Request prefetch detection
Worker Service
Port: None (background) | Technology: BullMQ
Background job processor that handles data processing, aggregation, and notifications.
Queue Workers:
Event Worker
Processes raw events and writes to ClickHouse event table.
Session Worker
Aggregates events into sessions with start/end times and duration.
User Worker
Creates and updates user records in ClickHouse.
Device Worker
Tracks unique devices based on user agent fingerprints.
Email Worker
Sends transactional emails and drip campaigns via Postmark.
First Event Worker
Handles special processing for a project’s first event.
User Enrichment Worker
Enriches user data with additional metadata.
Merge User Worker
Merges user records when identifiers are linked.
Salt Rotation Worker
Periodically rotates cryptographic salts for user ID hashing.
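The relationship between user-ID hashing and salt rotation can be illustrated with a small sketch. The helper below (`anonymousUserId`) is hypothetical and not the actual Hub/Worker code; it only shows why rotating the salt makes old hashes unlinkable to new ones.

```typescript
import { createHash } from "node:crypto";

// Illustrative only: derive an anonymous user ID by hashing identifying
// attributes together with a salt. When the Salt Rotation Worker swaps
// the salt, the same visitor produces a completely different hash.
function anonymousUserId(ip: string, userAgent: string, salt: string): string {
  return createHash("sha256").update(`${salt}:${ip}:${userAgent}`).digest("hex");
}
```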
- Failed jobs are stored in the PostgreSQL failed_queue_job table
- Configurable retry policies with exponential backoff
- Dead letter queue for permanently failed jobs
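A BullMQ-style exponential backoff policy can be sketched as a small pure function. `nextRetryDelay` is a hypothetical helper (not BullMQ's API) showing the usual shape: the delay doubles per attempt until the job is given up and dead-lettered.

```typescript
// Hypothetical helper mirroring an exponential-backoff retry policy:
// delay = baseDelayMs * 2^(attempt - 1); after maxAttempts, return null
// to signal that the job should go to the dead letter queue.
function nextRetryDelay(
  attempt: number,
  baseDelayMs = 1000,
  maxAttempts = 5,
): number | null {
  if (attempt >= maxAttempts) return null; // permanently failed
  return baseDelayMs * 2 ** (attempt - 1);
}
```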
BullBoard UI
Port: 4100 | Technology: Bull Board + Hono
Web UI for monitoring and managing BullMQ job queues.
Features:
- Real-time queue statistics
- Job status monitoring (active, completed, failed)
- Job retry and deletion controls
- Queue pause/resume controls
- Job details and error logs
Access is protected via the BULLBOARD_USERNAME and BULLBOARD_PASSWORD environment variables.
Data Storage
PostgreSQL (Primary Database)
Technology: PostgreSQL 17 + Prisma ORM
Schema:
User & Authentication
- user: User accounts with email and profile data
- session: Active user sessions (Better Auth)
- account: OAuth provider accounts
- verification: Email verification tokens
Organizations & Projects
- organization: Multi-tenant organizations
- project: Analytics projects with tracking tokens
- user_organization: User-organization relationships with roles
- user_project_access: Fine-grained project access control
- invitation: Pending organization invitations
- invitation_project_access: Project access for pending invitations
Configuration
- api_key: API keys for programmatic access
- funnel: Funnel definitions with steps
- user_identification_map: Maps user IDs to custom identifiers
- salt: Cryptographic salts for user ID hashing
Billing & Features
- billing_info: Paddle subscription data
- Organization.featureFlags: JSON feature flags
- Organization.customPlanEvents: Custom event limits
Email & Jobs
- email_drip_sequence: Email drip campaign state
- email_drip_history: Email delivery tracking
- failed_queue_job: Failed background jobs for debugging
ClickHouse (Analytics Database)
Technology: ClickHouse 23.10
Schema:
- event Table
- session Table
- device Table
Stores all analytics events with rich metadata.
Engine: CollapsingMergeTree(sign), which supports event updates/deletions
Partitioning: By month (toYYYYMM(createdAt))
Key Columns:
- projectId, userId, sessionId, deviceId
- name, isPageView, isPageLeave
- createdAt (DateTime64 with millisecond precision)
- Device info: osName, clientName, deviceType
- Location: countryCode, city
- UTM parameters: utmSource, utmMedium, utmCampaign
- userIdentifier, userDisplayName
- customData (JSON string)
- headers (for custom event metadata)
Schema migrations are managed with clickhouse-migrations.
ClickHouse uses specialized table engines optimized for analytics:
- CollapsingMergeTree: Efficient updates via sign column (+1 insert, -1 delete)
- ReplacingMergeTree: Automatic deduplication based on primary key
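The CollapsingMergeTree sign semantics can be illustrated with a toy in-memory collapse. This is not ClickHouse's actual merge algorithm, just a sketch of the idea: a row with sign -1 cancels a previously inserted row with sign +1 for the same key, so only the net state survives a merge.

```typescript
// Toy illustration of CollapsingMergeTree semantics: rows with the same
// key and opposite `sign` values (+1 insert, -1 cancel) collapse away.
type Row = { key: string; value: number; sign: 1 | -1 };

function collapse(rows: Row[]): Row[] {
  const groups = new Map<string, Row[]>();
  for (const row of rows) {
    const group = groups.get(row.key) ?? [];
    if (row.sign === -1 && group.some((r) => r.sign === 1)) {
      // cancel one previously seen +1 row with the same key
      group.splice(group.findIndex((r) => r.sign === 1), 1);
    } else {
      group.push(row);
    }
    groups.set(row.key, group);
  }
  return [...groups.values()].flat();
}
```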
Redis (Cache & Queue)
Technology: Redis 7 with persistence
Usage:
BullMQ Queues
- Event processing queues
- Session aggregation queues
- Email delivery queues
- User data update queues
Locking
- User identification locks
- Distributed locks for race condition prevention
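The locking pattern can be sketched with a minimal in-process async lock. This is only an illustration of why identification needs serializing (two concurrent events for the same visitor must not both create a user record); the actual Worker uses Redis-backed distributed locks, not this class.

```typescript
// Minimal in-process async lock: callers are queued and run one at a time.
class AsyncLock {
  private tail: Promise<void> = Promise.resolve();

  run<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // keep the chain alive even if fn rejects
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}
```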
Caching
- Project configuration cache
- User session cache
Rate Limiting
- API rate limiting (future)
- Event ingestion throttling (future)
All data stores are defined in docker-compose.yml for local development.
Data Flow
Event Ingestion Flow
Hub Processes Request
Hub service:
- Validates project token
- Checks for bot traffic
- Extracts IP geolocation
- Parses user agent
- Gets/sets user ID cookie
Queue Jobs
Hub pushes jobs to Redis queues:
- event-queue: Process and store event
- session-queue: Update session data
- user-queue: Update user data
- device-queue: Track device if new
Worker Processes Jobs
Worker service picks up jobs and:
- Writes event to the ClickHouse event table
- Aggregates into the session table
- Creates/updates user records
- Tracks unique devices
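The Session Worker's aggregation step can be sketched as folding a user's events into a session with start/end timestamps and a duration. The helper below is hypothetical, not the actual worker code.

```typescript
// Simplified sketch of session aggregation: given all events that share a
// sessionId, derive the session's start, end, and duration.
type TrackedEvent = { sessionId: string; createdAt: number }; // epoch millis

function aggregateSession(events: TrackedEvent[]) {
  const times = events.map((e) => e.createdAt);
  const startedAt = Math.min(...times);
  const endedAt = Math.max(...times);
  return {
    sessionId: events[0].sessionId,
    startedAt,
    endedAt,
    durationMs: endedAt - startedAt,
  };
}
```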
User Identification Flow
Monorepo Structure
Vemetric uses Turborepo for efficient monorepo management.
Build Pipeline
The build pipeline is defined in turbo.json:
- Database clients are generated before building apps
- Dependencies build before dependents
- Parallel execution where possible
- Caching for faster builds
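A turbo.json expressing these rules might look like the following. This is an illustrative sketch, not copied from the repo; the task names (`db:generate`) are assumptions, and the `pipeline` key follows Turborepo v1 syntax (newer versions use `tasks`).

```json
{
  "$schema": "https://turbo.build/schema.json",
  "pipeline": {
    "db:generate": {
      "cache": false
    },
    "build": {
      "dependsOn": ["^build", "db:generate"],
      "outputs": ["dist/**"]
    }
  }
}
```

`"^build"` makes every dependency build before its dependents, while independent packages build in parallel and cache their outputs.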
Scaling Considerations
Horizontal Scaling
App Service
Stateless - scale horizontally behind load balancer. Sessions stored in Redis.
Hub Service
Stateless - scale horizontally for high event throughput.
Worker Service
Scale based on queue depth. Multiple workers process jobs concurrently.
BullBoard
Single instance sufficient. Read-only access to Redis.
Database Scaling
PostgreSQL
PostgreSQL
- Read replicas for analytics queries
- Connection pooling (PgBouncer)
- Periodic vacuum and analyze
ClickHouse
ClickHouse
- Sharding for multi-billion event datasets
- Replication for high availability
- Partitioning by month (already configured)
- TTL policies for data retention
Redis
Redis
- Redis Cluster for horizontal scaling
- Redis Sentinel for high availability
- Persistence: AOF + RDB snapshots
Monitoring Points
Queue Depth
Monitor BullMQ queue sizes. High depth indicates need for more workers.
ClickHouse Query Time
Track query performance. Optimize slow queries with materialized views.
Event Ingestion Rate
Monitor events/second through Hub. Scale Hub replicas as needed.
Database Connections
Track PostgreSQL connection pool usage. Add pooler if saturated.
Redis Memory
Monitor Redis memory usage. Scale or add eviction policies.
Disk Usage
Track ClickHouse data size. Implement TTL or archival policies.
Next Steps
Database Setup
Set up PostgreSQL and ClickHouse databases
Monitoring
Monitor your Vemetric deployment