HLD: Cheer Coin System (Virtual Currency for Peer Recognition)
Understanding the Problem
What is a Cheer Coin System?
A Cheer Coin system is a virtual currency platform for peer-to-peer recognition within an organization. Employees receive a monthly allocation of "cheer coins" that they can give to colleagues to recognize great work, helpfulness, or embodying company values. Recipients accumulate coins that appear on leaderboards and may be redeemable for rewards (gift cards, swag, donations). Think of it as a gamified "thank you" system -- Amazon has similar internal programs for culture reinforcement.
Functional Requirements
Core (above the line):
- Send coins -- a user can send N coins to another user with a message explaining why
- View balance -- users see their "giveable" balance (coins they can give) and "received" balance (coins they have earned)
- Leaderboard -- real-time ranking of top coin recipients (weekly/monthly/all-time)
- Coin allocation -- system automatically allocates N coins to each user at the start of each month
Below the line (mention but don't design):
- Reward redemption (exchange coins for gift cards)
- Manager dashboards and analytics
- Team-level leaderboards
- Integration with Slack/Teams for sending coins via chat
- Recognition feed (social wall showing recent cheers)
Non-Functional Requirements
- Transaction integrity -- coin transfers must be atomic (sender's balance decreases, recipient's balance increases, or neither happens)
- Low latency -- sending coins and viewing balances under 200ms
- Scale -- 500K employees, each getting 100 coins/month, peak sending during holidays/review season (10x normal load)
- Real-time leaderboards -- updated within 5 seconds of a coin transfer
- Auditability -- complete transaction history for compliance and dispute resolution
The Set Up
Core Entities
| Entity | Description |
|---|---|
| User | id, name, department, managerId, giveableBalance, receivedBalance |
| Transaction | id, senderId, recipientId, amount, message, category (teamwork/innovation/etc.), createdAt |
| Allocation | id, userId, amount, periodStart, periodEnd, allocatedAt |
| LeaderboardEntry | userId, period (week/month/all-time), totalReceived, rank |
API Design
Send coins:
```
POST /api/coins/send
Authorization: Bearer <token>

Request:
{
  "recipientId": "user_456",
  "amount": 10,
  "message": "Thanks for helping debug the payment issue at midnight!",
  "category": "teamwork",
  "clientTransactionId": "uuid-from-client" // idempotency key
}

Response: 201 Created
{
  "transactionId": "txn_abc123",
  "senderGiveableBalance": 90,
  "recipientReceivedBalance": 210,
  "timestamp": "2024-01-15T10:30:00Z"
}
```
Get balances:
```
GET /api/coins/balance
Authorization: Bearer <token>

Response: 200 OK
{
  "giveableBalance": 90,
  "receivedBalance": 210,
  "monthlyAllocation": 100,
  "giveableExpiry": "2024-02-01T00:00:00Z"
}
```
Get leaderboard:
```
GET /api/leaderboard?period=monthly&limit=20
Authorization: Bearer <token>

Response: 200 OK
{
  "period": "2024-01",
  "entries": [
    { "rank": 1, "userId": "user_789", "name": "Jane Doe", "department": "Engineering", "received": 450 },
    { "rank": 2, "userId": "user_456", "name": "John Smith", "department": "Product", "received": 380 }
  ],
  "myRank": 47,
  "myReceived": 210
}
```
Get transaction history:
```
GET /api/coins/transactions?type=sent&cursor=txn_abc100&limit=20
Authorization: Bearer <token>

Response: 200 OK
{
  "transactions": [
    {
      "transactionId": "txn_abc123",
      "type": "sent",
      "recipientId": "user_456",
      "recipientName": "John Smith",
      "amount": 10,
      "message": "Thanks for helping...",
      "category": "teamwork",
      "timestamp": "2024-01-15T10:30:00Z"
    }
  ],
  "nextCursor": "txn_abc080"
}
```
High-Level Design
Flow 1: Sending Coins
```
[Client] --> [API Gateway] --> [Coin Service] --> [PostgreSQL (transactions, balances)]
                                   |      |
                                   v      v
               [Redis (balance cache)]   [Kafka (coin_transferred)]
                                              |
                                              v
                                     [Leaderboard Service]
                                              |
                                              v
                               [Redis Sorted Sets (leaderboards)]
```
Step-by-step:
- User A sends 10 coins to User B via `POST /api/coins/send`
- API Gateway authenticates the user and applies rate limiting (max 20 transfers/minute per user)
- Coin Service validates the request:
  a. Sender cannot be the same as recipient
  b. Amount > 0 and is a whole number
  c. Check the sender's giveable balance (from the Redis cache first, falling back to the DB)
  d. Verify the sender has enough coins: `giveableBalance >= amount`
  e. Check the idempotency key (`clientTransactionId`) -- if already processed, return the previous result
- Execute the transfer in a database transaction:

```sql
BEGIN;

-- Deduct from the sender's giveable balance
UPDATE users
SET giveable_balance = giveable_balance - 10
WHERE id = 'user_A' AND giveable_balance >= 10;
-- If no rows were updated, the balance is insufficient -- ROLLBACK

-- Add to the recipient's received balance
UPDATE users
SET received_balance = received_balance + 10
WHERE id = 'user_B';

-- Record the transaction
INSERT INTO transactions (id, sender_id, recipient_id, amount, message, category, created_at)
VALUES ('txn_abc123', 'user_A', 'user_B', 10, 'Thanks...', 'teamwork', NOW());

-- Store the idempotency key
INSERT INTO idempotency_keys (client_id, transaction_id, created_at)
VALUES ('uuid-from-client', 'txn_abc123', NOW());

COMMIT;
```

- Update the Redis caches:
  - Decrement the sender's cached giveable balance: `DECRBY balance:user_A:giveable 10`
  - Increment the recipient's cached received balance: `INCRBY balance:user_B:received 10`
- Publish an event to Kafka: `CoinTransferred { senderId, recipientId, amount, category, timestamp }`
- Return a success response with the updated balances
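To make the validation and idempotency steps concrete, here is a minimal in-memory sketch of the transfer logic. The store names (`balances`, `idempotency_keys`) and the exception types are illustrative stand-ins; in the real flow these steps run inside the single database transaction shown above.

```python
import uuid

class InsufficientBalance(Exception):
    pass

# In-memory stand-ins for the users table and idempotency_keys table
balances = {"user_A": {"giveable": 100, "received": 0},
            "user_B": {"giveable": 100, "received": 200}}
idempotency_keys = {}   # clientTransactionId -> transactionId
transactions = {}       # transactionId -> transaction record

def send_coins(sender, recipient, amount, message, client_txn_id):
    # Idempotency check: replaying the same client key returns the prior result
    if client_txn_id in idempotency_keys:
        return transactions[idempotency_keys[client_txn_id]]
    if sender == recipient or amount <= 0 or amount != int(amount):
        raise ValueError("invalid transfer")
    if balances[sender]["giveable"] < amount:
        raise InsufficientBalance()
    txn_id = str(uuid.uuid4())
    balances[sender]["giveable"] -= amount
    balances[recipient]["received"] += amount
    transactions[txn_id] = {"id": txn_id, "sender": sender,
                            "recipient": recipient, "amount": amount,
                            "message": message}
    idempotency_keys[client_txn_id] = txn_id
    return transactions[txn_id]
```

Note that a retried request with the same `clientTransactionId` returns the original transaction and does not move coins twice.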
Flow 2: Monthly Coin Allocation
- Scheduler (EventBridge / cron) triggers on the 1st of each month at 00:00 UTC
- Scheduler publishes an `AllocationTriggered` event to Kafka
- Allocation Service consumes the event and processes allocations in batches:

```sql
-- Process 1,000 users at a time
-- (ORDER BY id makes the LIMIT/OFFSET pagination deterministic)
UPDATE users
SET giveable_balance = 100
WHERE id IN (
  SELECT id FROM users
  WHERE active = true
  ORDER BY id
  LIMIT 1000 OFFSET :offset
);

INSERT INTO allocations (user_id, amount, period_start, period_end, allocated_at)
SELECT id, 100, '2024-02-01', '2024-03-01', NOW()
FROM users
WHERE active = true
ORDER BY id
LIMIT 1000 OFFSET :offset;
```

- Process in batches of 1,000 users to avoid long-running transactions
- For each batch, invalidate the Redis balance caches for those users
- After all batches complete, publish an `AllocationComplete` event
Handling the previous month's unspent coins: Before allocation, the service expires (zeros out) any remaining giveable balance from the previous month. Received balances are NOT expired -- those are the recipient's to keep.
Back-of-envelope: 500K users / 1,000 per batch = 500 batches. Each batch takes ~50ms (1000 updates). Total: 25 seconds. Acceptable for a monthly job.
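A minimal sketch of the batch loop, using an in-memory dict as a stand-in for the users table (the helper names and batch bookkeeping are illustrative):

```python
def allocate_monthly(users, allocation=100, batch_size=1000):
    """users: dict of user_id -> {'active': bool, 'giveable': int}.
    Overwrites each active user's giveable balance in fixed-size batches;
    returns the number of batches processed."""
    # Sort for deterministic batching (keyset-style pagination)
    ids = sorted(uid for uid, u in users.items() if u["active"])
    batches = 0
    for i in range(0, len(ids), batch_size):
        for uid in ids[i:i + batch_size]:
            # Overwriting implicitly expires last month's unspent coins
            users[uid]["giveable"] = allocation
        batches += 1
        # Real job: invalidate the Redis balance cache for this batch here
    return batches
```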
Flow 3: Leaderboard Generation
- Leaderboard Service consumes `CoinTransferred` events from Kafka
- For each event, update the Redis sorted sets:

```
ZINCRBY leaderboard:monthly:2024-01 10 "user_B"
ZINCRBY leaderboard:weekly:2024-W03 10 "user_B"
ZINCRBY leaderboard:alltime 10 "user_B"
```

- When a user requests the leaderboard:

```
ZREVRANGE leaderboard:monthly:2024-01 0 19 WITHSCORES  -- top 20
ZREVRANK leaderboard:monthly:2024-01 "user_B"          -- my rank
ZSCORE leaderboard:monthly:2024-01 "user_B"            -- my score
```

- Hydrate user details (name, department) from a User Cache
Performance: Redis ZINCRBY is O(log N). ZREVRANGE for top 20 is O(log N + 20). With 500K users in the sorted set, log(500K) = ~19 operations. Sub-millisecond.
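A pure-Python stand-in for the sorted-set operations above, to make the read/write paths explicit (a real service would call redis-py's `zincrby` / `zrevrange` / `zrevrank` against Redis instead of this dict):

```python
def zincrby(board, amount, member):
    """ZINCRBY equivalent: add `amount` to a member's score."""
    board[member] = board.get(member, 0) + amount

def top_n(board, n):
    """ZREVRANGE ... WITHSCORES equivalent: top n by descending score."""
    return sorted(board.items(), key=lambda kv: -kv[1])[:n]

def rank_of(board, member):
    """ZREVRANK equivalent: 0-based rank in descending score order."""
    ordered = sorted(board, key=lambda m: -board[m])
    return ordered.index(member)
```

Unlike Redis, the dict stand-in pays O(N log N) per read; the point of the sorted set is that Redis keeps the ordering incrementally.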
Flow 4: Viewing Balance
- Client calls `GET /api/coins/balance`
- Coin Service reads from Redis: `MGET balance:user_A:giveable balance:user_A:received`
- On a cache miss: query PostgreSQL and populate Redis with a TTL of 5 minutes
- Return balances with expiry info
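The cache-aside read path can be sketched as follows; the two dicts are in-memory stand-ins for Redis and PostgreSQL, and the key naming follows the flow above:

```python
CACHE_TTL_SECONDS = 300  # 5-minute TTL, as in the flow above

cache = {}      # "balance:<user>:<kind>" -> (value, expires_at)
database = {"user_A": {"giveable": 90, "received": 210}}

def get_balance(user_id, now=0.0):
    """Cache-aside read: Redis first, PostgreSQL on miss, then repopulate."""
    result = {}
    for kind in ("giveable", "received"):
        key = f"balance:{user_id}:{kind}"
        entry = cache.get(key)
        if entry and entry[1] > now:            # cache hit, not expired
            result[kind] = entry[0]
        else:                                   # miss: read DB, set with TTL
            value = database[user_id][kind]
            cache[key] = (value, now + CACHE_TTL_SECONDS)
            result[kind] = value
    return result
```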
Potential Deep Dives
Deep Dive 1: Coin Allocation and Budget Management
The Problem: The organization has 500K employees. Each gets 100 coins/month. But some departments might have different allocations. New hires mid-month should get prorated allocations. The total "coin supply" needs to be controlled for budgeting (if coins are redeemable for real rewards).
Bad Solution -- Fixed allocation, no budget tracking: Give everyone 100 coins. No tracking of total supply. If coins are redeemable, the finance team has no visibility into the total liability.
Good Solution -- Centralized allocation with departmental budgets:
```
Organization Budget: 50M coins/month
├── Engineering (2000 people): 200K coins (100 each)
├── Product (500 people): 75K coins (150 each -- higher allocation)
├── Sales (1000 people): 100K coins (100 each)
└── Remaining departments: ...
```

The Allocation Service checks the department budget before allocating:
```python
def allocate_for_department(dept_id, period):
    budget = get_department_budget(dept_id, period)
    active_employees = get_active_employees(dept_id)
    per_person = budget.total_coins // len(active_employees)
    for batch in chunk(active_employees, 1000):
        # Check the budget BEFORE writing, so we never over-allocate
        if budget.allocated + per_person * len(batch) > budget.total_coins:
            alert("Over-allocation for department " + dept_id)
            break
        allocate_batch(batch, per_person, period)
        budget.allocated += per_person * len(batch)
```

Great Solution -- Event-sourced allocation ledger: Instead of mutable balances, maintain an immutable ledger of all allocation and transfer events:
```
Event Log:
[ALLOCATE] user_A +100 coins (Jan 2024 allocation)
[TRANSFER] user_A -> user_B: 10 coins
[TRANSFER] user_A -> user_C: 5 coins
[EXPIRE]   user_A -85 coins (Jan 2024 unspent)
[ALLOCATE] user_A +100 coins (Feb 2024 allocation)
```

Current balance = replay of all events. This is expensive to compute on every request, so we maintain a materialized view (the giveableBalance and receivedBalance columns) that is updated on each event. The ledger is the source of truth for auditing.
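Balance-by-replay can be sketched in a few lines; the `(type, user, amount)` tuples below are an illustrative encoding of the event log, not the real event schema:

```python
def replay_balance(events, user_id):
    """Fold a user's giveable balance from allocation/spend/expiry events."""
    balance = 0
    for event_type, uid, amount in events:
        if uid != user_id:
            continue
        if event_type == "ALLOCATE":
            balance += amount
        elif event_type in ("TRANSFER_OUT", "EXPIRE"):
            balance -= amount
    return balance
```

The materialized view is just this fold kept up to date incrementally, with the event log retained for audit and rebuilds.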
Pro-rated allocation for mid-month hires:

```python
days_in_period = (period_end - period_start).days
days_remaining = (period_end - hire_date).days
proration_factor = days_remaining / days_in_period
allocation = math.floor(base_allocation * proration_factor)
```

Deep Dive 2: Transaction Ledger Design
The Problem: We need a complete, immutable audit trail of every coin movement. This is critical for compliance, dispute resolution, and analytics.
Bad Solution -- Rely on mutable balance columns: If we only track giveableBalance and receivedBalance on the user table, we have no history. If a bug incorrectly deducts coins, we cannot trace what happened.
Good Solution -- Transactions table with indexes:

```sql
CREATE TABLE transactions (
    id UUID PRIMARY KEY,
    sender_id UUID NOT NULL,
    recipient_id UUID NOT NULL,
    amount INT NOT NULL CHECK (amount > 0),
    message TEXT,
    category VARCHAR(50),
    created_at TIMESTAMP NOT NULL DEFAULT NOW()
);

-- Indexes for fast lookups
-- (PostgreSQL defines indexes outside CREATE TABLE)
CREATE INDEX idx_sender ON transactions (sender_id, created_at DESC);
CREATE INDEX idx_recipient ON transactions (recipient_id, created_at DESC);
```

This handles queries like "show me all coins I sent" and "show me all coins I received" efficiently.
Great Solution -- Double-entry bookkeeping: Every transfer creates two ledger entries:

```sql
CREATE TABLE ledger (
    id BIGSERIAL PRIMARY KEY,
    transaction_id UUID NOT NULL,
    account_id UUID NOT NULL,          -- the user
    account_type VARCHAR(20) NOT NULL, -- 'giveable' or 'received'
    entry_type VARCHAR(10) NOT NULL,   -- 'debit' or 'credit'
    amount INT NOT NULL,
    balance_after INT NOT NULL,        -- running balance
    created_at TIMESTAMP NOT NULL
);
```

For a transfer of 10 coins from A to B:

| transaction_id | account_id | account_type | entry_type | amount | balance_after |
|----------------|------------|--------------|------------|--------|---------------|
| txn_123 | user_A | giveable | debit | 10 | 90 |
| txn_123 | user_B | received | credit | 10 | 210 |

Invariant: For any transaction, the sum of debits must equal the sum of credits. We can verify ledger integrity with:
```sql
SELECT transaction_id,
       SUM(CASE WHEN entry_type = 'debit' THEN amount ELSE 0 END) AS debits,
       SUM(CASE WHEN entry_type = 'credit' THEN amount ELSE 0 END) AS credits
FROM ledger
GROUP BY transaction_id
-- Column aliases are not visible in HAVING, so repeat the expressions
HAVING SUM(CASE WHEN entry_type = 'debit' THEN amount ELSE 0 END)
    != SUM(CASE WHEN entry_type = 'credit' THEN amount ELSE 0 END);
-- Should return 0 rows
```

This is the gold standard for financial systems. It allows us to detect any inconsistency and rebuild balances from the ledger at any time.
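The posting and integrity-check logic can be sketched in memory; in the real system both entries are written in one database transaction, and the list below is an illustrative stand-in for the ledger table:

```python
ledger = []  # list of ledger-entry dicts, mirroring the table columns

def post_transfer(txn_id, sender, recipient, amount,
                  sender_balance_after, recipient_balance_after):
    """Write the two entries (debit + credit) for one transfer."""
    ledger.append({"transaction_id": txn_id, "account_id": sender,
                   "account_type": "giveable", "entry_type": "debit",
                   "amount": amount, "balance_after": sender_balance_after})
    ledger.append({"transaction_id": txn_id, "account_id": recipient,
                   "account_type": "received", "entry_type": "credit",
                   "amount": amount, "balance_after": recipient_balance_after})

def unbalanced_transactions():
    """Equivalent of the SQL GROUP BY / HAVING check: for every
    transaction, total debits must equal total credits."""
    totals = {}
    for entry in ledger:
        debit, credit = totals.get(entry["transaction_id"], (0, 0))
        if entry["entry_type"] == "debit":
            debit += entry["amount"]
        else:
            credit += entry["amount"]
        totals[entry["transaction_id"]] = (debit, credit)
    return [t for t, (d, c) in totals.items() if d != c]
```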
Deep Dive 3: Leaderboard Scaling
The Problem: 500K users, leaderboard queried frequently (every time someone opens the recognition page). Must be real-time (within 5 seconds of a transfer).
Bad Solution -- Query PostgreSQL on every leaderboard request:

```sql
SELECT recipient_id, SUM(amount) AS total_received
FROM transactions
WHERE created_at >= '2024-01-01'
GROUP BY recipient_id
ORDER BY total_received DESC
LIMIT 20;
```

This scans millions of rows. Even with indexes, it takes seconds -- not viable for a frequently accessed page.
Good Solution -- Redis sorted sets (what we use): As described in Flow 3. O(log N) updates, O(log N + K) reads for top K. Redis handles this effortlessly for 500K users.
Stale leaderboard on restart: If Redis crashes, we lose the sorted set. Rebuild it from the transactions table on startup:

```sql
-- Feed a rebuild worker that re-populates the Redis sorted set
INSERT INTO redis_rebuild_queue (user_id, total_received)
SELECT recipient_id, SUM(amount)
FROM transactions
WHERE created_at >= '2024-01-01'
GROUP BY recipient_id;
```

With 500K entries, this takes ~10 seconds. During the rebuild, serve a cached snapshot from the last periodic backup.
Great Solution -- Tiered leaderboards with materialized views:
For very high traffic:
- Real-time tier (Redis sorted set): Updated on every coin transfer. Serves live leaderboard requests.
- Periodic snapshot (PostgreSQL materialized view): Refreshed every hour. Used for historical leaderboards (last month, last quarter) and as a backup for Redis.
- Department/team leaderboards: Maintain separate Redis sorted sets per department. When a coin is transferred, update both the global and departmental leaderboards.
```
ZINCRBY leaderboard:monthly:2024-01 10 "user_B"            -- global
ZINCRBY leaderboard:monthly:2024-01:eng 10 "user_B"        -- engineering dept
ZINCRBY leaderboard:weekly:2024-W03:team_alpha 10 "user_B" -- team
```

"My Rank" optimization: ZREVRANK is O(log N). For 500K users, that is fast. But if we had 50M users, we might use approximate ranking with percentiles instead.
Deep Dive 4: Fraud Prevention
The Problem: Users might game the system:
- Self-sending: creating alt accounts to send coins to themselves
- Coin rings: groups of users sending coins back and forth to inflate received counts
- Bot sending: automated scripts to send coins at scale
Detection strategies:
Rule-based (immediate):
```python
def validate_transfer(sender, recipient, amount):
    # Rule 1: Cannot send to self
    if sender.id == recipient.id:
        raise FraudError("Self-transfer not allowed")

    # Rule 2: Same-manager check (suspicious if too frequent)
    if sender.manager_id == recipient.manager_id:
        recent = get_transfers_between(sender.id, recipient.id, last_30_days)
        if len(recent) > 10:
            flag_for_review(sender.id, recipient.id, "Excessive same-team transfers")

    # Rule 3: Reciprocal pattern detection
    reverse_transfers = get_transfers_between(recipient.id, sender.id, last_30_days)
    if sum(t.amount for t in reverse_transfers) > 50:
        flag_for_review(sender.id, recipient.id, "Reciprocal transfer pattern")

    # Rule 4: Rapid-fire sending
    recent_sends = get_recent_sends(sender.id, last_5_minutes)
    if len(recent_sends) > 10:
        raise RateLimitError("Too many transfers in a short period")
```

Graph-based (batch, daily): Build a transfer graph where nodes are users and edges are transfers (weighted by amount). Run cycle detection to find coin rings:
```
user_A -> user_B (50 coins)
user_B -> user_C (45 coins)
user_C -> user_A (40 coins)
```

This cycle suggests a coin ring. Flag all participants for admin review.
ML-based (advanced): Train a model on features: transfer frequency, average amount, recipient diversity, time-of-day patterns. Label known fraud cases. The model scores each user's behavior and flags anomalies.
Deep Dive 5: Coin Expiry
The Problem: Giveable coins expire at the end of each month (use-it-or-lose-it policy). Received coins might also expire after 12 months (to prevent hoarding and ensure ongoing participation).
Giveable expiry (simple): During the monthly allocation job, simply overwrite the giveable balance:

```sql
UPDATE users SET giveable_balance = :newAllocation WHERE active = true;
```

Any unspent coins from the previous month are implicitly expired. Record the expiry in the ledger for audit:

```sql
INSERT INTO ledger (account_id, account_type, entry_type, amount, balance_after)
VALUES ('user_A', 'giveable', 'debit', 85, 0); -- expired 85 unspent coins
```

Received coin expiry (complex): If received coins expire after 12 months, we need to track when each coin was received. Two approaches:
Approach 1 -- Batch expiry job: A monthly job queries transactions older than 12 months and deducts those amounts from received balances:

```sql
SELECT recipient_id, SUM(amount) AS expired_amount
FROM transactions
WHERE created_at < NOW() - INTERVAL '12 months'
  AND NOT expired  -- flag to prevent double-counting
GROUP BY recipient_id;
```

Mark the processed transactions as expired = true and deduct from the received balances.
Approach 2 -- FIFO queue per user: Think of each user's received balance as a queue of "coin batches," each with an expiry date. When coins are redeemed, consume from the oldest batch first (FIFO). Expired batches are removed.
```
User B's received coin batches:
[
  { "amount": 50, "receivedAt": "2023-02-15", "expiresAt": "2024-02-15" },
  { "amount": 30, "receivedAt": "2023-06-01", "expiresAt": "2024-06-01" },
  { "amount": 10, "receivedAt": "2024-01-15", "expiresAt": "2025-01-15" }
]
Total received balance: 90 (but 50 expire next month)
```

This is more complex to implement but gives users visibility into when their coins expire, which is a better UX.
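A sketch of Approach 2: each user's received balance is a FIFO of coin batches with expiry dates. The batch shape mirrors the JSON above; function names are illustrative:

```python
from collections import deque
from datetime import date

def expire_batches(batches, today):
    """Drop expired batches; return (remaining, total_expired_amount)."""
    expired = sum(b["amount"] for b in batches if b["expiresAt"] <= today)
    remaining = deque(b for b in batches if b["expiresAt"] > today)
    return remaining, expired

def redeem(batches, amount):
    """Consume `amount` coins oldest-first (FIFO), mutating the deque."""
    while amount > 0:
        if not batches:
            raise ValueError("insufficient received balance")
        batch = batches[0]
        take = min(batch["amount"], amount)
        batch["amount"] -= take
        amount -= take
        if batch["amount"] == 0:
            batches.popleft()   # batch fully consumed
    return batches
```

Consuming oldest-first means redemptions naturally reduce the coins closest to expiry, which minimizes what the batch expiry job later removes.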
Deep Dive 6: Real-Time Balance Updates
The Problem: After a coin transfer, both the sender and recipient should see their updated balance immediately, even if they are viewing the page at that moment.
Bad Solution -- Polling: Client polls GET /api/coins/balance every 5 seconds. With 500K users, that is 100K requests/sec of mostly unchanged data.
Good Solution -- WebSocket per active user: When a user is on the recognition page, they open a WebSocket connection. The Coin Service publishes balance updates via Redis Pub/Sub:

```
Channel: balance_updates:user_B
Message: { "receivedBalance": 210, "from": "user_A", "amount": 10 }
```

The WebSocket server subscribes to the user's channel and pushes updates.
Great Solution -- Server-Sent Events (SSE) with topic-based routing: SSE is simpler than WebSocket for unidirectional updates (server to client). Each user subscribes to their update stream:
```
GET /api/coins/stream
Accept: text/event-stream

data: {"type": "coin_received", "from": "Jane Doe", "amount": 10, "newBalance": 210}

data: {"type": "leaderboard_update", "myRank": 46, "myReceived": 210}
```

Architecture:
- Coin Service publishes transfer event to Kafka
- Notification Worker consumes from Kafka
- Worker publishes to Redis Pub/Sub channels for both sender and recipient
- SSE Gateway servers subscribe to relevant channels and push to connected clients
At 500K employees, assume 10% are on the recognition page at any time = 50K SSE connections. At 10K connections per SSE Gateway instance = 5 instances needed. Very manageable.
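The worker-to-gateway fan-out in the steps above can be sketched with a toy pub/sub: the worker publishes one message per interested user's channel, and connected clients (plain lists standing in for SSE connections) receive it. The channel naming follows the Redis Pub/Sub scheme above; no real Redis or Kafka is involved here.

```python
import json

channels = {}  # channel name -> list of subscriber queues

def subscribe(user_id):
    """An SSE Gateway registering a connected client for a user."""
    queue = []
    channels.setdefault(f"balance_updates:{user_id}", []).append(queue)
    return queue

def on_coin_transferred(event):
    """Notification Worker: fan out to both sender and recipient."""
    for msg_type, uid in (("coin_sent", event["senderId"]),
                          ("coin_received", event["recipientId"])):
        message = json.dumps({"type": msg_type, "amount": event["amount"]})
        for queue in channels.get(f"balance_updates:{uid}", []):
            queue.append(message)   # gateway would push this over SSE
```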
What is Expected at Each Level
Mid-Level
- Design the basic send-coins flow with a relational database transaction
- Understand why atomicity matters (sender deducts and recipient receives, or neither)
- Basic leaderboard using a SQL query with ORDER BY
- Know that we need caching for balance reads
Senior
- Design the double-entry bookkeeping ledger for audit compliance
- Redis sorted sets for real-time leaderboards
- Monthly allocation as a batch job with pro-ration logic
- Fraud detection rules (self-sending, reciprocal patterns)
- Idempotency for duplicate transfer prevention
- Back-of-envelope for Redis memory and transaction throughput
- Event-driven architecture with Kafka for leaderboard updates
Staff+
- Event-sourced ledger design with materialized views for balance
- Coin expiry with FIFO batch tracking
- Graph-based fraud detection for coin rings
- Departmental budget management and allocation governance
- Real-time balance updates via SSE/WebSocket with Redis Pub/Sub
- Compliance considerations: SOX-like audit requirements, data retention policies
- Multi-region: how to handle a global organization with employees in different time zones (when does the month "start"?)