
Amazon Interview Question: Designing a shopping cart that supports millions of concurrent users
TL;DR
- DynamoDB as the authoritative cart store, Redis for low-latency reads.
- Idempotent mutations + conditional writes + DynamoDB transactions to prevent duplicates and lost updates.
- Guest → user merge is a single atomic transaction; no global locks.
- Pricing snapshots are for display only; re-price at checkout.
- Short TTLs for guest carts; signed-in carts never expire.
- EventBridge broadcasts cart changes for analytics and personalization.
Architecture Overview
- Edge: Route 53 → CloudFront → WAF → S3 (SPA)
- Auth: Amazon Cognito (user pools). Anonymous users hold a long-lived deviceId cookie.
- API: API Gateway (HTTP) → ECS/Fargate or Lambda (cart-svc).
- State: DynamoDB (source of truth)
- Cache: ElastiCache Redis (hot reads + pub/sub invalidations)
- Events: DynamoDB Streams → EventBridge (e.g., CartUpdated, CartMerged)
- Observability: CloudWatch + X-Ray + OTEL traces; canaries.
flowchart LR
subgraph Edge
CF[CloudFront] --> WAF[WAF]
WAF --> UI[S3 Web App]
end
CF --> APIGW[API Gateway]
APIGW --> CART["cart-svc (ECS/Lambda)"]
CART --> DDB[(DynamoDB)]
CART --> REDIS[(ElastiCache Redis)]
DDB -- Streams --> EVB[(EventBridge)]
CART --> XRay[X-Ray/OTEL]
Data Model (Single-Table DynamoDB)
Partition cart state by cartId (high cardinality, uniform distribution). Maintain reverse pointers for quick lookup by user/device.
| PK | SK | Type | Attributes |
|---|---|---|---|
| CART#<cartId> | META | Cart | userId?, deviceId?, currency, createdAt, updatedAt, ttl? |
| CART#<cartId> | ITEM#<sku>#<key> | CartItem | qty, priceSnapshot{amount,currency}, attrs{size,color}, promoCodes[], updatedAt, source, idemKey? |
| USER#<userId> | ACTIVE_CART | UserIndex | cartId, updatedAt |
| DEVICE#<deviceId> | ACTIVE_CART | DeviceIdx | cartId, updatedAt |
Notes
- ttl only on guest carts (e.g., 30 days). Signed-in carts do not expire.
- ITEM#<sku>#<key> allows multiple distinct lines of the same SKU (e.g., different color/size). If not needed, omit #<key>.
- GSI1: GSI1PK = USER#<userId>, GSI1SK = ACTIVE_CART for O(1) user→cart lookup.
- GSI2: GSI2PK = DEVICE#<deviceId>, GSI2SK = ACTIVE_CART for guest cart recovery.
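To make the key layout above concrete, here is a minimal sketch of key builders. The function names are hypothetical, and the rule for deriving `<key>` (attribute names sorted, joined as `name:value` pairs with `;`) is an assumption chosen so the same variant always maps to the same sort key; the article does not specify the ordering. The `ttl` helper relies on DynamoDB TTL attributes being epoch seconds.

```python
from typing import Dict, Optional

SECONDS_PER_DAY = 86400


def cart_pk(cart_id: str) -> str:
    """Partition key shared by the META row and every ITEM# row of a cart."""
    return f"CART#{cart_id}"


def item_sk(sku: str, attrs: Optional[Dict[str, str]] = None) -> str:
    """Sort key for a cart line; the #<key> suffix is omitted when there are no attrs."""
    if not attrs:
        return f"ITEM#{sku}"
    # Assumption: sort attribute names so {"size","color"} and {"color","size"} collide.
    key = ";".join(f"{name}:{attrs[name]}" for name in sorted(attrs))
    return f"ITEM#{sku}#{key}"


def guest_ttl(now_epoch: int, days: int = 30) -> int:
    """Expiry for guest carts as epoch seconds; signed-in carts simply omit ttl."""
    return now_epoch + days * SECONDS_PER_DAY
```

For example, `item_sk("SKU-123", {"size": "M", "color": "Navy"})` yields the same key regardless of the order in which the client sent the attributes.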
API Surface (Idempotent by Default)
All mutating endpoints accept Idempotency-Key (UUID). The service stores the key per cart and returns the prior result if replayed.
POST /carts -> { cartId } # create guest cart
GET /carts/{cartId} -> { items[], currency, ... }
POST /carts/{cartId}/items -> {sku, qty, attrs?, priceSnapshot?, source?}
PATCH /carts/{cartId}/items/{itemId} -> {qty, promoCodes?}
DELETE /carts/{cartId}/items/{itemId}
# Auth-required
POST /carts/attach -> {cartId} # attach guest cart to signed-in user
POST /carts/merge -> {guestCartId}
GET /me/cart -> current cart for principal
Common Errors
- 409 CONFLICT: conditional write failed (stale merge / race); retry with fresh reads.
- 409 IDEMPOTENT_REPLAY: returning previous success payload.
- 422 UNPROCESSABLE: invalid payload (e.g., negative qty, malformed attrs).
Core Flows
1) Create & Use a Guest Cart
- POST /carts creates a cart bound to deviceId (cookie).
- POST /carts/{id}/items upserts a line:
  - If ITEM# exists → qty += delta; else insert.
  - Store idemKey on the line row or keep a per-cart IDEM#<key> row.
- Cache the cart JSON in Redis (cart:<id>, TTL 10–30s). On writes, publish CART_UPDATED:<id> to invalidate.
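The "qty += delta; else insert" upsert can be expressed as a single DynamoDB UpdateItem, because the `ADD` action treats a missing numeric attribute as 0. The sketch below only builds the request parameters in the shape the low-level boto3 client's `update_item` accepts; the function name and table name are hypothetical, and it does not call AWS.

```python
from typing import Any, Dict


def build_add_item_update(table: str, cart_id: str, item_sk: str,
                          delta: int, now_iso: str) -> Dict[str, Any]:
    """Build UpdateItem parameters that add `delta` to qty, creating the line if absent."""
    if delta <= 0:
        raise ValueError("delta must be positive for an add-to-cart operation")
    return {
        "TableName": table,
        "Key": {"PK": {"S": f"CART#{cart_id}"}, "SK": {"S": item_sk}},
        # ADD creates qty with value `delta` when the row or attribute does not exist.
        "UpdateExpression": "SET #updatedAt = :now ADD #qty :delta",
        # Name placeholders avoid any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#qty": "qty", "#updatedAt": "updatedAt"},
        "ExpressionAttributeValues": {
            ":delta": {"N": str(delta)},
            ":now": {"S": now_iso},
        },
        "ReturnValues": "ALL_NEW",
    }
```

The resulting dictionary could be passed as keyword arguments to a DynamoDB client call; a real handler would also set the remaining line attributes (attrs, priceSnapshot, source) on first insert.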
2) Login → Attach or Merge
- If user has no active cart: attach guest cart (txn: set META.userId, write USER#, delete DEVICE#).
- If both exist: merge guest → user cart (single transaction):
  - Read both carts; group by (sku, attrs); sum quantities.
  - Keep most recent priceSnapshot for display (re-price at checkout).
  - TransactWriteItems: upsert merged lines into target; delete source lines; update USER#; remove DEVICE#.
  - Emit CartMerged event.
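The merge step (group by sku and attrs, sum quantities, keep the most recent priceSnapshot) is a pure function and can be sketched independently of DynamoDB. The function names are hypothetical, and the sketch assumes every line carries an `updatedAt` ISO-8601 UTC string in one consistent format, so plain string comparison orders them by time.

```python
import json
from typing import Any, Dict, List, Tuple

Line = Dict[str, Any]


def line_key(line: Line) -> Tuple[str, str]:
    """Identity of a cart line: the SKU plus a canonical form of its attrs."""
    return (line["sku"], json.dumps(line.get("attrs", {}), sort_keys=True))


def merge_lines(guest: List[Line], user: List[Line]) -> List[Line]:
    """Combine guest lines into user lines without mutating either input."""
    merged: Dict[Tuple[str, str], Line] = {}
    for line in list(user) + list(guest):
        key = line_key(line)
        if key not in merged:
            merged[key] = dict(line)
            continue
        current = merged[key]
        total = current["qty"] + line["qty"]
        # Keep the fields (including priceSnapshot) of whichever line is newer.
        newest = line if line["updatedAt"] > current["updatedAt"] else current
        merged[key] = {**newest, "qty": total}
    return list(merged.values())
```

The output of `merge_lines` is what the transaction would then write into the target cart, alongside the deletes of the source lines.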
3) Mutations
- PATCH sets absolute quantity. qty <= 0 → delete line.
- Use ConditionExpression on updatedAt or itemVersion to avoid lost updates.
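A minimal sketch of the PATCH rule with optimistic concurrency on itemVersion follows. It returns the name of the DynamoDB operation together with its parameters and makes no AWS call; the function name is hypothetical, and it assumes each line row stores a numeric itemVersion that the client read earlier. If the stored version differs, DynamoDB rejects the write with a conditional check failure, which the service would surface as 409 CONFLICT.

```python
from typing import Any, Dict, Tuple


def build_set_qty(table: str, cart_id: str, item_sk: str, qty: int,
                  expected_version: int, now_iso: str) -> Tuple[str, Dict[str, Any]]:
    """Return (operation, params): a conditional delete for qty <= 0, else a conditional update."""
    params: Dict[str, Any] = {
        "TableName": table,
        "Key": {"PK": {"S": f"CART#{cart_id}"}, "SK": {"S": item_sk}},
        # The write succeeds only if nobody changed the line since it was read.
        "ConditionExpression": "#ver = :expected",
        "ExpressionAttributeNames": {"#ver": "itemVersion"},
        "ExpressionAttributeValues": {":expected": {"N": str(expected_version)}},
    }
    if qty <= 0:
        return "delete_item", params
    params["UpdateExpression"] = "SET #qty = :qty, #ver = #ver + :one, #updatedAt = :now"
    params["ExpressionAttributeNames"].update({"#qty": "qty", "#updatedAt": "updatedAt"})
    params["ExpressionAttributeValues"].update({
        ":qty": {"N": str(qty)},
        ":one": {"N": "1"},
        ":now": {"S": now_iso},
    })
    return "update_item", params
```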
4) Reads
- Attempt Redis → fall back to DynamoDB Query on PK=CART#<id>.
- Optional enrichment (title, image, price range) via Catalog cache keyed by sku.
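The read path is a cache-aside lookup. The sketch below uses an in-memory stand-in for Redis (the `TtlCache` class is hypothetical; its `get`/`setex` methods only mirror the shape of the Redis commands) and a caller-supplied `load_from_store` function in place of the DynamoDB Query, so it runs without any external service.

```python
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """In-memory stand-in for Redis GET/SETEX, with an injectable clock for tests."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value

    def setex(self, key: str, ttl_seconds: float, value: str) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)


def read_cart(cart_id: str, cache: TtlCache,
              load_from_store: Callable[[str], Optional[Dict[str, Any]]],
              ttl_seconds: float = 20) -> Optional[Dict[str, Any]]:
    """Return the cart from cache if present, else load it and cache it briefly."""
    key = f"cart:{cart_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    cart = load_from_store(cart_id)
    if cart is not None:
        cache.setex(key, ttl_seconds, json.dumps(cart))
    return cart
```

Missing carts are deliberately not cached here, so a cart created a moment later is visible on the next read.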
Idempotency & Concurrency
- Idempotency-Key scoped to (cartId, operation). Persist a compact record {hash(request), resultRef, expiresAt}. Replays with matching hash return the same result.
- Conditional writes protect against stale overwrites; compare updatedAt or itemVersion.
- Transactions (TransactWriteItems, up to 100 actions per call) ensure atomic merge/attach.
- No global locks; contention remains per-cart and per-line.
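The idempotency record described above can be sketched with an in-memory store. The class and exception names are hypothetical, `expiresAt` handling is left out for brevity, and the choice to reject a reused key with a different payload is an assumption (the article only defines the matching-hash case). A production version would persist the record with a conditional write so that concurrent replays cannot both run the handler.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different request body."""


def request_hash(request: Dict[str, Any]) -> str:
    """Stable hash of the request payload (key order does not matter)."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyStore:
    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str, str], Dict[str, Any]] = {}

    def execute(self, cart_id: str, operation: str, idem_key: str,
                request: Dict[str, Any],
                handler: Callable[[Dict[str, Any]], Any]) -> Any:
        """Run handler once per (cartId, operation, key); replays return the stored result."""
        scope = (cart_id, operation, idem_key)
        digest = request_hash(request)
        record = self._records.get(scope)
        if record is not None:
            if record["hash"] != digest:
                raise IdempotencyConflict("key reused with a different payload")
            return record["result"]
        result = handler(request)
        self._records[scope] = {"hash": digest, "result": result}
        return result
```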
Caching Strategy
- Redis stores full cart documents for hot reads; short TTL to minimize staleness.
- Write-through: after DynamoDB commit, refresh Redis; on mutation, publish invalidate message for multi-node caches.
- Do not CDN-cache personalized cart APIs.
Pricing & Promotions
- priceSnapshot is a display hint only. Checkout re-prices all lines and promotions.
- Promotions are computed by a dedicated service; write derived discount lines:
CART#<id> ITEM#DISC#<promoId> DiscountLine {amount, reason, scope: "cart"|"item:<itemId>"}
- On any material change (add/remove/qty), discount lines are recomputed.
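As an illustration of "display hint only", the sketch below re-prices cart lines from current prices and flags lines whose snapshot has drifted. The function name, the `current_prices` lookup, and the output fields `unitAmount`, `lineTotal`, and `priceChanged` are all hypothetical; amounts are assumed to be integers in minor currency units, and promotions are out of scope here.

```python
from typing import Any, Dict, List, Tuple

Line = Dict[str, Any]


def reprice(lines: List[Line], current_prices: Dict[str, int]) -> Tuple[List[Line], int]:
    """Return (repriced lines, cart subtotal) using current prices, ignoring snapshots."""
    repriced: List[Line] = []
    for line in lines:
        unit = current_prices[line["sku"]]  # KeyError if the SKU is no longer sold
        snapshot = line.get("priceSnapshot", {}).get("amount")
        repriced.append({
            **line,
            "unitAmount": unit,
            "lineTotal": unit * line["qty"],
            # Lets the UI tell the shopper that the displayed price was stale.
            "priceChanged": snapshot is not None and snapshot != unit,
        })
    subtotal = sum(line["lineTotal"] for line in repriced)
    return repriced, subtotal
```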
SLOs, Metrics, and Alarms
- Performance: p50 < 50–80 ms; p95 < 200 ms for read/write (under typical cache hit rates).
- Availability: 99.9% monthly for GET/POST /items.
- Golden Signals: RPS, error rate, p95 latency, DynamoDB throttles, Redis hit rate, transaction failure rate.
- Canaries: synthetic user add/remove every minute. Alarms on sustained error/latency elevation.
Capacity & Cost Sketch
Assume 100M monthly cart operations, 60% reads / 40% writes; avg item size 1 KB; avg cart read returns 4–6 KB.
DynamoDB
- On-demand to start. With steady traffic, move to provisioned + auto scaling.
- Typical cost: low hundreds to low thousands USD/month depending on request volume and item sizes.
Redis
- 2× cache.t3.medium (cluster-mode) ≈ low hundreds USD/month.
API
- API Gateway + Lambda (or ECS Fargate) ≈ low to mid hundreds USD/month, workload-dependent.
Measure with real traffic; tune TTLs and cache hit rate. Enable PITR on DynamoDB.
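The workload assumptions above can be turned into request-unit estimates with a few lines of arithmetic. This sketch uses DynamoDB's sizing rules (one write unit per 1 KB written, one read unit per 4 KB strongly consistent read, half that for eventually consistent reads) and deliberately contains no prices; the function name is hypothetical, and it treats every read as hitting DynamoDB, so Redis hits would lower the read figure.

```python
import math
from typing import Dict

SECONDS_PER_MONTH = 30 * 24 * 3600


def monthly_request_units(total_ops: int, read_percent: int, write_item_kb: float,
                          read_kb: float, strongly_consistent: bool = False) -> Dict[str, float]:
    """Estimate monthly DynamoDB read/write request units for a cart workload."""
    reads = total_ops * read_percent // 100
    writes = total_ops - reads
    write_units = writes * math.ceil(write_item_kb / 1.0)  # 1 unit per 1 KB written
    read_units = reads * math.ceil(read_kb / 4.0)          # 1 unit per 4 KB read
    if not strongly_consistent:
        read_units //= 2                                   # eventually consistent reads cost half
    return {
        "reads": reads,
        "writes": writes,
        "read_units": read_units,
        "write_units": write_units,
        "avg_ops_per_second": total_ops / SECONDS_PER_MONTH,
    }


estimate = monthly_request_units(100_000_000, 60, write_item_kb=1, read_kb=6)
print(estimate)
```

Under these assumptions the workload averages roughly 39 operations per second, with 40M write units and 60M read units per month; multiplying those by the currently published per-unit prices gives the on-demand bill. Transactional writes consume more units than plain writes, so merge-heavy traffic raises the write figure.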
Failure Modes & Degradation
- DynamoDB throttling: serve last-known Redis value with a warning; enqueue write to SQS for retry as a last resort.
- Redis outage: read directly from DynamoDB; latency increases but correctness preserved.
- Merge races: transaction conflict → 409; client retries after refetch.
- Expired guest carts: return 404 and create a fresh cart; optionally soft-recover via DEVICE# pointer if present.
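The "409, then retry after refetch" behaviour can be sketched as a small client-side helper. The names `ConflictError` and `retry_on_conflict` are hypothetical, and the exponential backoff with full jitter is an assumption added here; the article only states that the client retries after refetching.

```python
import random
import time
from typing import Any, Callable


class ConflictError(Exception):
    """Stands in for an HTTP 409 CONFLICT response from cart-svc."""


def retry_on_conflict(operation: Callable[[Any], Any],
                      refetch: Callable[[], Any],
                      max_attempts: int = 4,
                      base_delay: float = 0.05,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> Any:
    """Refetch fresh state before every attempt; back off with jitter between attempts."""
    for attempt in range(max_attempts):
        state = refetch()
        try:
            return operation(state)
        except ConflictError:
            if attempt == max_attempts - 1:
                raise  # give up and let the caller surface the conflict
            # Full jitter: wait a random fraction of an exponentially growing delay.
            sleep(base_delay * (2 ** attempt) * rng())
```

Injecting `sleep` and `rng` keeps the helper deterministic in tests while defaulting to real time and randomness in use.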
Security & Compliance
- TLS everywhere; KMS encryption at rest (DynamoDB, Redis, S3).
- Least-privilege IAM (service can only access its table and related streams).
- JWT verification at API Gateway (Cognito authorizer). Device cookies are opaque IDs.
- Cart payloads avoid PII beyond IDs; never store payment data.
Testing & Resilience
- Unit: merge combiner, idempotency deduper, conditional expression builders.
- Contract: OpenAPI schemas; consumer-driven tests with checkout.
- Load: k6/Locust scenarios with steady 5k rps reads, 1k rps writes; flash spikes ×3.
- Chaos: Fault Injection Simulator (throttles, latency spikes); Redis eviction storms.
- Soak: 24h run to validate TTLs, memory growth, and retry logic.
Minimal Deployment Blueprint
- Compute: ECS Fargate (blue/green via CodeDeploy) or Lambda with canary release.
- Networking: Private subnets, VPC endpoints for DynamoDB/CloudWatch.
- Config: SSM Parameter Store (non-secrets), Secrets Manager (secrets).
- Infra as Code: Terraform or AWS CDK; enable PITR and alarms by default.
Reference Payloads
// POST /carts/{cartId}/items (Idempotency-Key: 1f82...)
{
  "sku": "SKU-123",
  "qty": 2,
  "attrs": {"size":"M","color":"Navy"},
  "priceSnapshot": {"amount": 1999, "currency": "USD"},
  "source": "web"
}
// GET /carts/{cartId}
{
  "cartId": "8d55...fe",
  "currency": "USD",
  "items": [
    {
      "itemId": "ITEM#SKU-123#size:M;color:Navy",
      "sku": "SKU-123",
      "qty": 2,
      "attrs": {"size":"M","color":"Navy"},
      "priceSnapshot": {"amount":1999,"currency":"USD"},
      "updatedAt": "2025-10-18T10:20:00Z"
    }
  ],
  "discounts": [],
  "updatedAt": "2025-10-18T10:20:00Z"
}