
Account Pool

Updated: Mar 2026 · Status: Live in production (since 7 Feb 2026)


What Problem This Solves

Before the account pool, a new trial user had to wait 10-30 seconds after registration while the backend:

  1. Created an account in MySQL
  2. Called fill_configs to push XRAY configs to all 138+ servers
  3. Waited for confirmation from each server

This made onboarding slow: users watched a loading spinner for up to half a minute.

Solution: Pre-create accounts in bulk during low-traffic hours. When a real user signs up, they instantly receive a pre-configured account that already has XRAY configs on every server.


How It Works

Account States

Accounts live in account.pay_status (MySQL enum). The pool adds one new state:

Status | Meaning
POOL | Pre-created, unclaimed. validTo = 2099-01-01. Configs already exist on all servers.
TRIAL | Active trial user (claimed from pool or freshly created)
PAID | Paid subscriber

Pool Lifecycle

AccountPoolScheduler (every 5 min)
  → COUNT(*) WHERE pay_status = 'POOL'
  → if count < min-size (100): fill pool up to target-size (500)
  → each fill: createAccount() → fill_configs on all servers → set pay_status = POOL, validTo = 2099

User signs up / makes trial payment
  → PaymentServiceImpl.claimFromPool()
      → SELECT ... FROM account WHERE pay_status = 'POOL' ORDER BY id LIMIT 1 FOR UPDATE
      → UPDATE: pay_status = TRIAL, deviceId = ..., validTo = now + trial_days
      → COMMIT
  → postCommitTaskService → vpnConfigServiceClient.fillAccount() (updates expiry on servers)
  → if pool empty: fallback to createAccount() (synchronous, slow path)

Atomicity

The claim operation uses SELECT ... FOR UPDATE so that concurrent requests cannot claim the same account. The row lock is taken by the SELECT and released at COMMIT, so it is held only for the short claim transaction (one SELECT and one UPDATE). Other claimers wait on the locked row and then get the next POOL account.
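
The locking behaviour can be illustrated without a database. The sketch below is an in-memory analogue, not the actual PaymentServiceImpl code: a ReentrantLock stands in for the row lock taken by SELECT ... FOR UPDATE, and the class InMemoryAccountPool and its method runConcurrentClaims are hypothetical names introduced only for this illustration. It assumes that a claimer who finds no POOL account receives an empty result and then takes the slow createAccount() path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical in-memory stand-in for the account table; not the real service. */
final class InMemoryAccountPool {
    enum PayStatus { POOL, TRIAL, PAID }

    // id -> status, iterated in ascending id order (mirrors ORDER BY id)
    private final TreeMap<Long, PayStatus> accounts = new TreeMap<>();
    // Plays the role of the lock taken by SELECT ... FOR UPDATE
    private final ReentrantLock rowLock = new ReentrantLock();

    InMemoryAccountPool(int poolSize) {
        for (long id = 1; id <= poolSize; id++) {
            accounts.put(id, PayStatus.POOL);
        }
    }

    /** Claims the lowest-id POOL account, or returns empty when none is left. */
    Optional<Long> claimFromPool() {
        rowLock.lock(); // concurrent claimers queue here
        try {
            for (Map.Entry<Long, PayStatus> entry : accounts.entrySet()) {
                if (entry.getValue() == PayStatus.POOL) {
                    entry.setValue(PayStatus.TRIAL); // the UPDATE step
                    return Optional.of(entry.getKey());
                }
            }
            return Optional.empty(); // caller would fall back to createAccount()
        } finally {
            rowLock.unlock(); // corresponds to COMMIT releasing the lock
        }
    }

    /** Submits `claimers` concurrent claims and returns the ids that were handed out. */
    static List<Long> runConcurrentClaims(InMemoryAccountPool pool, int claimers) {
        ExecutorService executor = Executors.newFixedThreadPool(8);
        try {
            Callable<Optional<Long>> task = pool::claimFromPool;
            List<Future<Optional<Long>>> futures = new ArrayList<>();
            for (int i = 0; i < claimers; i++) {
                futures.add(executor.submit(task));
            }
            List<Long> claimedIds = new ArrayList<>();
            for (Future<Optional<Long>> future : futures) {
                future.get().ifPresent(claimedIds::add);
            }
            return claimedIds;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Claim simulation failed", e);
        } finally {
            executor.shutdown();
        }
    }
}
```

With a pool of 5 accounts and 20 concurrent claimers, exactly 5 distinct ids are handed out and the other 15 callers receive an empty Optional, which corresponds to the fallback path.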


Configuration

application-prod.properties on backend:

ACCOUNT_POOL_ENABLED=true
account.pool.min-size=100
account.pool.target-size=500

History:

  • 7 Feb 2026: initial values min=50, target=100
  • 24 Feb 2026: increased to min=100, target=500 (onboarding volume grew)
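
As a rough illustration of how these three settings could be read and sanity-checked, here is a minimal sketch using java.util.Properties. The class AccountPoolSettings is hypothetical, and two behaviours are assumptions of this sketch rather than documented facts: a missing ACCOUNT_POOL_ENABLED is treated as false, and a target-size smaller than min-size is rejected. The real backend presumably binds these values through Spring.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical holder for the pool settings; not the backend's actual class. */
final class AccountPoolSettings {
    final boolean enabled;
    final int minSize;
    final int targetSize;

    private AccountPoolSettings(boolean enabled, int minSize, int targetSize) {
        this.enabled = enabled;
        this.minSize = minSize;
        this.targetSize = targetSize;
    }

    /** Parses properties-file text and validates the two size values. */
    static AccountPoolSettings parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalArgumentException("Unreadable properties text", e);
        }
        // Assumption: an absent flag means the pool is disabled.
        boolean enabled = Boolean.parseBoolean(
                props.getProperty("ACCOUNT_POOL_ENABLED", "false").trim());
        int minSize = requiredInt(props, "account.pool.min-size");
        int targetSize = requiredInt(props, "account.pool.target-size");
        // Assumption: the scheduler fills up to target, so target must not be below min.
        if (minSize < 0 || targetSize < minSize) {
            throw new IllegalArgumentException(
                    "Expected 0 <= min-size <= target-size, got min=" + minSize
                            + ", target=" + targetSize);
        }
        return new AccountPoolSettings(enabled, minSize, targetSize);
    }

    private static int requiredInt(Properties props, String key) {
        String raw = props.getProperty(key);
        if (raw == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return Integer.parseInt(raw.trim());
    }
}
```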


Scheduler Details

Class: AccountPoolScheduler (Spring @Scheduled)

Interval: every 5 minutes

Logic:

  1. Count the pool: SELECT COUNT(*) FROM account WHERE pay_status = 'POOL'
  2. If count >= min-size → no action
  3. If count < min-size → batch-create accounts until target-size is reached
  4. Each created account goes through the standard createAccount() → fill_configs → mark as POOL
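
The threshold decision in the scheduler logic reduces to a small pure function. The sketch below is an illustration only: PoolRefillPlanner and accountsToCreate are hypothetical names, and the argument checks are assumptions added for safety rather than documented behaviour of AccountPoolScheduler.

```java
/** Hypothetical helper that mirrors the scheduler's refill decision. */
final class PoolRefillPlanner {
    private PoolRefillPlanner() {
    }

    /**
     * Returns how many accounts one scheduler run should create.
     *
     * @param currentPoolSize result of the COUNT(*) query for pay_status = 'POOL'
     * @param minSize         refill threshold (account.pool.min-size)
     * @param targetSize      size to fill up to (account.pool.target-size)
     */
    static int accountsToCreate(long currentPoolSize, int minSize, int targetSize) {
        if (currentPoolSize < 0 || minSize < 0 || targetSize < minSize) {
            throw new IllegalArgumentException("Inconsistent pool sizes");
        }
        if (currentPoolSize >= minSize) {
            return 0; // pool is healthy, nothing to do on this run
        }
        // Below the threshold: top up all the way to the target, not just to min.
        return Math.toIntExact(targetSize - currentPoolSize);
    }
}
```

With the current settings (min 100, target 500), a pool of 99 accounts yields 401 accounts to create, while a pool of exactly 100 yields 0.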

Post-claim fill: After a pool account is claimed, postCommitTaskService calls vpnConfigServiceClient.fillAccount(accountId) (POST /api/v1/sync/fill-account/{accountId}, 120s timeout). This updates the XRAY expiry time on servers — configs already exist, only the expiry changes.
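
For reference, a request matching the path and the 120-second timeout described above could be built with the JDK's java.net.http API as sketched here. This is not the actual VpnConfigServiceClient: the class name FillAccountRequests is hypothetical, the base URI is supplied by the caller, and sending an empty request body is an assumption, since the source does not say what the endpoint expects.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical request builder for the post-claim fill call. */
final class FillAccountRequests {
    // 120 s timeout, as stated for the fill-account call
    static final Duration FILL_TIMEOUT = Duration.ofSeconds(120);

    private FillAccountRequests() {
    }

    /** Builds POST {base}/api/v1/sync/fill-account/{accountId} without sending it. */
    static HttpRequest fillAccount(URI vcsBaseUri, long accountId) {
        URI target = vcsBaseUri.resolve("/api/v1/sync/fill-account/" + accountId);
        return HttpRequest.newBuilder(target)
                .timeout(FILL_TIMEOUT)
                // Assumption: the endpoint needs no body, only the id in the path.
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

The request would then be passed to an HttpClient, for example HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), after the claim transaction has committed.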


All Claim Paths

Every user creation flow goes through claimFromPool() first:

Flow | Entry point | Pool claim? | Fallback
Trial registration | PaymentServiceImpl | Yes | createAccount()
Paid subscription | PaymentServiceImpl | Yes | createAccount()
Admin-created account | Admin API | Yes | createAccount()

Database Migration

V100: V100__account_pay_status_pool.sql — adds POOL value to account.pay_status enum.

Relevant columns:

  • account.pay_status: includes POOL, TRIAL, PAID, etc.
  • account.valid_to: POOL accounts have 2099-01-01

POOL accounts also have the standard configs on all servers, the same as any active account.


Admin API

Endpoint | Method | Description
/pool/status | GET | Returns current pool size, min, target
/pool/fill?count=20 | POST | Manually fill pool by N accounts

Use /pool/fill to top up the pool immediately when its size drops unexpectedly (e.g., after a traffic spike or a bug), instead of waiting for the next scheduler run.
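
The two admin endpoints can also be called from Java. The sketch below only builds the requests and is an illustration under assumptions: PoolAdminRequests is a hypothetical class, the Bearer token header follows the manual refill command used in this document, and it is assumed (not confirmed by the source) that /pool/status accepts the same header and that /pool/fill takes no request body.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical request builders for the pool admin endpoints. */
final class PoolAdminRequests {
    private PoolAdminRequests() {
    }

    /** Builds GET {base}/pool/status. */
    static HttpRequest status(URI apiBase, String adminJwt) {
        return HttpRequest.newBuilder(apiBase.resolve("/pool/status"))
                .header("Authorization", "Bearer " + adminJwt)
                .GET()
                .build();
    }

    /** Builds POST {base}/pool/fill?count=N for a positive N. */
    static HttpRequest fill(URI apiBase, String adminJwt, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        return HttpRequest.newBuilder(apiBase.resolve("/pool/fill?count=" + count))
                .header("Authorization", "Bearer " + adminJwt)
                // Assumption: the count travels in the query string only.
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

Either request can then be sent with java.net.http.HttpClient; checking /pool/status before and after a manual fill shows whether the refill took effect.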


Monitoring in Grafana

Three Prometheus gauges exposed by the backend:

Metric | Type | Description
account_pool_size | gauge | Current number of unclaimed POOL accounts
account_pool_min | gauge | Configured minimum threshold (100)
account_pool_target | gauge | Configured target size (500)

Alert

Rule: AccountPoolLow
Condition: account_pool_size < account_pool_min for > 10 minutes
Severity: warning
Contact: tg-vpn

Grafana Panels

  • Business Executive dashboard — stat panel: account_pool_size
  • Executive SLA dashboard — gauge: account_pool_size / account_pool_target * 100 (pool fill %)
  • API Security dashboard — timeseries: account_pool_size + account_pool_target

PromQL Queries

# Current pool size
account_pool_size

# Pool fill percentage
account_pool_size / account_pool_target * 100

# Is pool below minimum?
account_pool_size < account_pool_min

What to Do if Pool Depletes

  1. Check Grafana: Business Executive → Account Pool stat panel
  2. Check why the pool dropped:
       • Sudden traffic spike (many new users)
       • fill_configs failures (VCS worker down)
       • Scheduler bug
  3. Manually refill:
    curl -X POST -H "Authorization: Bearer $ADMIN_JWT" \
      "https://api.shivavpn.io/pool/fill?count=200"

  4. Monitor: pool should recover within 5 minutes (one scheduler tick)

If fill_configs is failing, fix the VCS worker first — otherwise filled accounts won't have configs on servers and the fallback path (createAccount()) will be used for new users anyway.


File | Purpose
AccountPoolScheduler.java | 5-min scheduler, fills pool
PaymentServiceImpl.java | claimFromPool() (all claim paths)
PostCommitTaskService.java | Triggers fillAccount() after claim
VpnConfigServiceClient.java | POST /api/v1/sync/fill-account/{id}
V100__account_pay_status_pool.sql | DB migration: POOL enum value
archive/planning-implemented-2026-03/PLAN-ACCOUNT-POOL-PRECREATION.md | Original implementation plan
docs/monitoring/METRICS-CATALOG.md | Full metrics catalog including pool metrics

See also: Backend Architecture · Account Mapping · VPN Config Service