Scaling

можно. is designed for horizontal scaling. Every server instance is stateless — all persistent state lives in PostgreSQL. Add more replicas to handle more traffic.

Stateless Architecture

```mermaid
graph TD
    LB[Load Balancer] --> S1[Mozhno Instance 1]
    LB --> S2[Mozhno Instance 2]
    LB --> S3[Mozhno Instance N]
    S1 --> DB[(PostgreSQL)]
    S2 --> DB
    S3 --> DB
    S1 --> C1[(Caffeine Cache)]
    S2 --> C2[(Caffeine Cache)]
    S3 --> C3[(Caffeine Cache)]
```

Each instance:

  • Maintains its own local Caffeine cache (in-memory, no shared state)
  • Connects to the same PostgreSQL database
  • Has no affinity or sticky sessions

Load Balancing

No Sticky Sessions Required

Authentication uses JWTs signed with HMAC-SHA256. Each token carries all the claims needed to authorize a request, so any instance that holds the same signing secret can validate it without querying a shared session store or the issuing instance. This means:

  • Any load balancer algorithm works (round-robin, least-connections, random)
  • No session affinity cookies needed
  • An instance can be terminated without losing user sessions
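
As a rough illustration of why no shared session store is needed, the sketch below signs and verifies an HS256 token using only the Python standard library. It is not the server's actual implementation: the function names, the secret, and the claim names are assumptions made for the example. The only requirement it demonstrates is that every instance is configured with the same signing secret.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    # JWTs use URL-safe base64 without padding
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison; no database or peer instance is consulted
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and claims["exp"] <= current:
        raise ValueError("token expired")
    return claims


# Hypothetical secret shared by all instances via configuration
SECRET = b"same-secret-on-every-instance"
token = sign_hs256({"sub": "alice", "exp": 2_000_000_000}, SECRET)  # "issued" by instance 1
claims = verify_hs256(token, SECRET, now=1_700_000_000)             # "validated" by instance 2
```

Because verification depends only on the token and the secret, terminating the instance that issued a token does not invalidate it.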

Load Balancer Configuration

| Setting | Recommendation |
| --- | --- |
| Algorithm | Least connections (least_conn in nginx) |
| Health check | GET /actuator/health/readiness every 10 s |
| Timeout | 30 s connect, 60 s read |
| Keep-alive | Enable for reduced connection overhead |

nginx Example

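```nginx
upstream mozhno_backend {
    least_conn;
    # Passive health checks: take a server out of rotation for 30 s after 3 failed attempts
    server mozhno-1:8080 max_fails=3 fail_timeout=30s;
    server mozhno-2:8080 max_fails=3 fail_timeout=30s;
    # Idle keep-alive connections to the upstream, kept per worker process
    keepalive 32;
}

server {
    listen 443 ssl;
    server_name mozhno.example.com;
    # ssl_certificate and ssl_certificate_key must also be set here; paths are deployment-specific

    location / {
        proxy_pass http://mozhno_backend;
        # HTTP/1.1 with an empty Connection header is required for upstream keep-alive
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_connect_timeout 30s;
        proxy_read_timeout 60s;
    }

    location /actuator/health {
        proxy_pass http://mozhno_backend;
    }
}
```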

Caching

можно. uses Caffeine — a local in-memory cache within a single JVM. No Redis, no distributed cache required.

What Is Cached

| Cache | Data Stored | Invalidation |
| --- | --- | --- |
| clientFlags | GET /api/client/features response for SDKs | @CacheEvict on any flag, segment, strategy, or context change |
| flags | Flag queries (admin panel) | @CacheEvict on flag create/update/delete |
| segments | Segment queries | @CacheEvict on segment create/update/delete |
| projects | Project list | @CacheEvict on project create/update/delete |
| tags | Tag list | @CacheEvict on tag create/update/delete |
| contextDefinitions | Context definitions | @CacheEvict on context create/update/delete |

All caches share a single TTL, CACHE_TTL_MINUTES (default: 5 minutes). Maximum size: 5000 entries per cache.

How Invalidation Works

When a flag is changed via the REST API:

POST /api/v1/flags/42 → @CacheEvict(allEntries = true) → clientFlags cache cleared

The eviction happens only on the instance that handled the request. Other instances keep serving their cached copy until its TTL expires.

Multi-Node Nuance

```mermaid
graph LR
    Admin -->|POST /api/v1/flags/42| LB
    LB -->|request lands on| S1[Instance 1]
    S1 -->|@CacheEvict<br/>locally| C1[Caffeine ✓ cleared]
    S1 --> PG[(PostgreSQL)]

    S2[Instance 2] -->|cache NOT cleared<br/>waits for TTL| C2[Caffeine ✗ stale]

    SDK -->|GET /api/client/features| S2
    S2 -->|returns stale rules| SDK
```

Instance 1: cache cleared instantly. Instance 2: cache stays stale until its TTL expires (up to 5 minutes with the default CACHE_TTL_MINUTES).

This is not a bug — it's expected behavior for a local cache. Feature flags do not require real-time consistency. A few minutes of staleness is acceptable for gradual rollouts.
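
The staleness window can be reproduced with a small simulation. This is a sketch under stated assumptions, not the server's caching code: `LocalTtlCache` is a hypothetical stand-in for a per-instance Caffeine cache, the `database` dict stands in for PostgreSQL, and time is passed in explicitly so the example is deterministic.

```python
class LocalTtlCache:
    """Hypothetical stand-in for one instance's local cache with a single TTL."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # key -> (value, stored_at)

    def get(self, key, loader, now: float):
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self.ttl_seconds:
            return entry[0]          # still fresh: serve the cached value
        value = loader()             # missing or expired: reload from the database
        self._entries[key] = (value, now)
        return value

    def evict_all(self):
        # Rough equivalent of @CacheEvict(allEntries = true), local to this instance
        self._entries.clear()


database = {"flag-42": "off"}                 # shared source of truth
instance_1 = LocalTtlCache(ttl_seconds=300)   # 5-minute TTL, as in the default
instance_2 = LocalTtlCache(ttl_seconds=300)


def read(cache: LocalTtlCache, now: float) -> str:
    return cache.get("flag-42", lambda: database["flag-42"], now)


# t = 0 s: both instances cache the current value
read(instance_1, 0)
read(instance_2, 0)

# t = 10 s: the admin request lands on instance 1, which writes and evicts locally
database["flag-42"] = "on"
instance_1.evict_all()

after_write = (read(instance_1, 11), read(instance_2, 11))  # instance 2 is still stale
after_ttl = read(instance_2, 301)                           # TTL elapsed, reloaded
print(after_write, after_ttl)  # ('on', 'off') on
```

Lowering `ttl_seconds` shrinks the window in which the two instances disagree, at the cost of more database reads.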

Recommendations

| Scenario | CACHE_TTL_MINUTES | Why |
| --- | --- | --- |
| 1 instance | 5 (default) | Cache cleared instantly on changes |
| Multi-node | 1 or 0 | Minimize inconsistency window between instances. 0 = cache disabled |
| Enterprise | 5 + Redis | Add spring-boot-starter-data-redis, switch CACHE_TYPE to redis, configure SPRING_DATA_REDIS_*. Invalidation goes through Redis Pub/Sub and is instant across all instances |

Connection Pool Sizing

As you scale horizontally, adjust HikariCP's maximum-pool-size to avoid overwhelming PostgreSQL:

| Instances | Pool Size per Instance | Total Connections | PostgreSQL max_connections |
| --- | --- | --- | --- |
| 1 | 30 | 30 | 40 |
| 2 | 30 | 60 | 70 |
| 4 | 15 | 60 | 70 |
| 8 | 8 | 64 | 80 |

Formula:

pool_size = min(30, floor(max_connections / instances) - 2)
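
The formula can be checked against the table with a few lines of Python. This is only a worked example: the function and parameter names are made up here, and the guard against a non-positive result is an addition that the formula itself does not state.

```python
def pool_size(max_connections: int, instances: int, cap: int = 30, reserve: int = 2) -> int:
    """pool_size = min(cap, floor(max_connections / instances) - reserve)"""
    size = min(cap, max_connections // instances - reserve)
    if size < 1:
        # Added guard (assumption): the database cannot serve this many instances
        raise ValueError("max_connections is too low for this many instances")
    return size


# (instances, PostgreSQL max_connections) pairs from the table above
for instances, max_connections in [(1, 40), (2, 70), (4, 70), (8, 80)]:
    size = pool_size(max_connections, instances)
    print(f"{instances} instance(s): pool {size}, total {size * instances} of {max_connections}")
```

In every row the total stays below max_connections, which leaves a few connections free for other clients such as migrations or an administrative session.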

Set via environment variable:

```bash
HIKARI_MAX_POOL_SIZE=15
HIKARI_MIN_IDLE=3
```

Performance Characteristics

Request Profile

можно. serves a read-heavy workload:

| Operation | Ratio | Typical Latency |
| --- | --- | --- |
| SDK flag sync (read) | ~80% | 5–20 ms |
| Dashboard API (read) | ~15% | 10–50 ms |
| Flag write/update (write) | ~5% | 20–100 ms |

Throughput

Benchmarks on a 2 vCPU / 2 GB instance, PostgreSQL on the same network:

| Endpoint | Requests/sec |
| --- | --- |
| GET /api/client/features (100 flags) | ~8,000 |
| GET /api/v1/flags (dashboard) | ~2,000 |
| POST /api/v1/flags (create) | ~500 |
| POST /api/v1/auth/login | ~1,000 |

Throughput scales roughly linearly at first (4 instances ≈ 4× throughput); at high instance counts the bottleneck shifts to PostgreSQL.
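
For a first capacity estimate, the benchmark figures can be turned into an instance count. The helper below is a back-of-the-envelope sketch: it assumes the linear-scaling approximation holds, ignores PostgreSQL limits, and introduces a hypothetical `target_utilization` parameter for headroom that does not come from the benchmarks.

```python
import math


def instances_needed(target_rps: float, per_instance_rps: float,
                     target_utilization: float = 1.0) -> int:
    """Instances required if throughput scales linearly with instance count."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_rps = per_instance_rps * target_utilization
    return max(1, math.ceil(target_rps / usable_rps))


# ~8,000 req/s per instance for GET /api/client/features, from the table above
print(instances_needed(20_000, 8_000))                          # 3
print(instances_needed(20_000, 8_000, target_utilization=0.7))  # 4
```

Treat the result as a starting point and confirm it with a load test, since real traffic mixes reads and writes.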

Bottleneck Analysis

| Scale | Primary Bottleneck | Mitigation |
| --- | --- | --- |
| 1–4 instances | Application CPU | Scale horizontally |
| 4–8 instances | PostgreSQL connections | Reduce pool size, add read replicas |
| 8–16 instances | PostgreSQL I/O | Read replicas, connection pooling (PgBouncer) |
| 16+ instances | PostgreSQL writes | Partitioned tables, async writes (Enterprise) |

JVM Tuning

For consistent performance under load:

```bash
JAVA_TOOL_OPTIONS="
  -XX:+UseZGC
  -XX:MaxRAMPercentage=75
  -XX:+ExitOnOutOfMemoryError
  -XX:ConcGCThreads=2
  -XX:ParallelGCThreads=2
  -XX:ZCollectionInterval=30
  -Djava.net.preferIPv4Stack=true
"
```

ZGC keeps pause times very short (typically under a millisecond on recent JDKs) regardless of heap size. It is well suited to latency-sensitive HTTP APIs, where even a 50 ms GC pause would show up in tail latency.

Vertical vs Horizontal

| Approach | When to Use | Limits |
| --- | --- | --- |
| Vertical (bigger instance) | Single node, < 100 req/s | CPU sockets, memory slots |
| Horizontal (more instances) | > 100 req/s, HA required | PostgreSQL becomes bottleneck |
| Both | High throughput + headroom | Budget |

Start vertical, scale horizontally when you need high availability or exceed a single instance's capacity.

Monitoring Scaling Behavior

Key metrics to watch:

| Metric | Source | Action When |
| --- | --- | --- |
| CPU usage | /actuator/metrics/system.cpu.usage | > 70% sustained → scale up |
| Heap memory | /actuator/metrics/jvm.memory.used | > 75% of limit → scale up or increase limit |
| DB connection pool active | /actuator/metrics/hikaricp.connections.active | Approaching max → increase pool or add instances |
| HTTP 503 responses | Access log | Readiness probe failing → check DB |
| Request latency p99 | /actuator/metrics/http.server.requests | > 200 ms → investigate bottleneck |
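
A small script can poll these endpoints and apply the thresholds. The sketch below assumes the standard Spring Boot Actuator metrics response shape (a `measurements` list of `statistic`/`value` pairs) and that the metrics endpoint is exposed. The base URL, the function names, and the 90% pool threshold used for "approaching max" are illustrative assumptions.

```python
import json
import urllib.request


def metric_value(payload: dict, statistic: str = "VALUE") -> float:
    """Extract one measurement from an Actuator /actuator/metrics/{name} response."""
    for measurement in payload.get("measurements", []):
        if measurement.get("statistic") == statistic:
            return float(measurement["value"])
    raise KeyError(f"no {statistic} measurement in {payload.get('name')}")


def fetch_metric(base_url: str, name: str, timeout: float = 5.0) -> float:
    # base_url is deployment-specific, e.g. a hypothetical http://mozhno-1:8080
    with urllib.request.urlopen(f"{base_url}/actuator/metrics/{name}", timeout=timeout) as response:
        return metric_value(json.load(response))


def scaling_hints(cpu_usage: float, pool_active: float, pool_max: float) -> list:
    """Apply the table's thresholds to a single sample of each metric."""
    hints = []
    if cpu_usage > 0.70:  # system.cpu.usage is reported as a 0..1 fraction
        hints.append("CPU above 70%: scale up if sustained")
    if pool_max > 0 and pool_active / pool_max >= 0.90:  # assumed cutoff for "approaching max"
        hints.append("connection pool near max: increase pool or add instances")
    return hints


# Example with a canned response instead of a live server
sample = {"name": "system.cpu.usage", "measurements": [{"statistic": "VALUE", "value": 0.82}]}
print(scaling_hints(metric_value(sample), pool_active=14, pool_max=15))
```

A single sample cannot show that usage is sustained, so in practice the check would run on an interval and act only after several consecutive breaches.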

See Also

  • Database: backups, replication, configuration
  • Database: connection pool, backups
  • Docker: single-node deployment

Released under the AGPL v3.0 License.