Amazon Interview Question: Designing a Shopping Cart That Supports Millions of Concurrent Users

Smart Mock Interview

TL;DR

  • DynamoDB as the authoritative cart store, Redis for low-latency reads.
  • Idempotent mutations + conditional writes + DynamoDB transactions to prevent duplicates and lost updates.
  • Guest → user merge is a single atomic transaction; no global locks.
  • Pricing snapshots are for display only; re-price at checkout.
  • Short TTLs for guest carts; signed-in carts never expire.
  • EventBridge broadcasts cart changes for analytics and personalization.

Architecture Overview

  • Edge: Route 53 → CloudFront → WAF → S3 (SPA)
  • Auth: Amazon Cognito (user pools). Anonymous users hold a long-lived deviceId cookie.
  • API: API Gateway (HTTP) → ECS/Fargate or Lambda (cart-svc).
  • State: DynamoDB (source of truth)
  • Cache: ElastiCache Redis (hot reads + pub/sub invalidations)
  • Events: DynamoDB Streams → EventBridge, bridged by EventBridge Pipes or a Lambda stream consumer (e.g., CartUpdated, CartMerged)
  • Observability: CloudWatch + X-Ray + OTEL traces; canaries.
flowchart LR
  subgraph Edge
    CF[CloudFront] --> WAF[WAF]
    WAF --> UI[S3 Web App]
  end

  CF --> APIGW[API Gateway]
  APIGW --> CART["cart-svc (ECS/Lambda)"]
  CART --> DDB[(DynamoDB)]
  CART --> REDIS[(ElastiCache Redis)]
  DDB -- Streams --> EVB[(EventBridge)]
  CART --> XRay[X-Ray/OTEL]
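
The Streams → EventBridge hop needs a small translation step somewhere. As a rough sketch, assuming a Lambda-style consumer, the function below turns one DynamoDB Streams record into an entry shaped for the EventBridge PutEvents call. The bus name `cart-events`, the source string `cart-svc`, and the detail fields are illustrative assumptions, not part of the design above.

```python
import json
from typing import Optional


def to_event_entry(record: dict, bus_name: str = "cart-events") -> Optional[dict]:
    """Map one DynamoDB Streams record to an EventBridge PutEvents entry.

    Returns None for rows that are not part of a cart partition
    (for example the USER# / DEVICE# pointer rows).
    """
    keys = record["dynamodb"]["Keys"]
    pk = keys["PK"]["S"]
    if not pk.startswith("CART#"):
        return None
    detail = {
        "cartId": pk[len("CART#"):],
        "sk": keys["SK"]["S"],
        "change": record["eventName"],  # INSERT, MODIFY or REMOVE
    }
    return {
        "Source": "cart-svc",          # hypothetical source name
        "DetailType": "CartUpdated",
        "EventBusName": bus_name,      # hypothetical bus name
        "Detail": json.dumps(detail),  # PutEvents expects a JSON string
    }
```

A stream record alone cannot tell a merge from an ordinary update, so CartMerged would likely be emitted by cart-svc itself after the merge transaction commits.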

Data Model (Single-Table DynamoDB)

Partition cart state by cartId (high cardinality, uniform distribution). Maintain reverse pointers for quick lookup by user/device.

PK                          SK                 Type        Attributes
CART#<cartId>               META               Cart        userId?, deviceId?, currency, createdAt, updatedAt, ttl?
CART#<cartId>               ITEM#<sku>#<key>   CartItem    qty, priceSnapshot{amount,currency}, attrs{size,color},
                                                       promoCodes[], updatedAt, source, idemKey?
USER#<userId>               ACTIVE_CART        UserIndex   cartId, updatedAt
DEVICE#<deviceId>           ACTIVE_CART        DeviceIdx   cartId, updatedAt

Notes

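  • ttl only on guest carts (e.g., 30 days). Signed-in carts do not expire. DynamoDB evaluates TTL per item, so write the same ttl (epoch seconds) to META and to every ITEM# row of a guest cart, and remove it from all rows when the cart is attached to a user. TTL deletion is asynchronous, so reads should treat rows whose ttl is in the past as already expired.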
  • ITEM#<sku>#<key> allows multiple distinct lines of the same SKU (e.g., different color/size). If not needed, omit #<key>.
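  • GSI1 (optional): GSI1PK = USER#<userId>, GSI1SK = ACTIVE_CART. The USER#<userId> / ACTIVE_CART pointer row already gives O(1) user→cart lookup through a strongly consistent GetItem; add the GSI only if you prefer indexing attributes on META over maintaining pointer rows, and keep in mind that GSI reads are eventually consistent.
  • GSI2 (optional): GSI2PK = DEVICE#<deviceId>, GSI2SK = ACTIVE_CART for guest cart recovery, with the same trade-off against the DEVICE# pointer row.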
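
To make the key layout concrete, here is a minimal sketch of key builders for this table. The helper names, the use of epoch seconds for timestamps, and the choice to sort attribute names inside the `<key>` suffix are assumptions made for illustration; sorting simply guarantees that the same variant always maps to the same sort key.

```python
import time
from typing import Dict, Optional

GUEST_TTL_DAYS = 30  # the "e.g., 30 days" figure from the notes


def cart_pk(cart_id: str) -> str:
    return f"CART#{cart_id}"


def item_sk(sku: str, attrs: Optional[Dict[str, str]] = None) -> str:
    """Sort key for a cart line: ITEM#<sku> or ITEM#<sku>#<key>.

    Attribute names are sorted so that {"size": "M", "color": "Navy"} and
    {"color": "Navy", "size": "M"} land on the same row.
    """
    if not attrs:
        return f"ITEM#{sku}"
    key = ";".join(f"{name}:{attrs[name]}" for name in sorted(attrs))
    return f"ITEM#{sku}#{key}"


def guest_meta_item(cart_id: str, device_id: str, currency: str,
                    now: Optional[int] = None) -> dict:
    """META row for a guest cart, including the ttl attribute (epoch seconds)."""
    now = int(time.time()) if now is None else now
    return {
        "PK": cart_pk(cart_id),
        "SK": "META",
        "Type": "Cart",
        "deviceId": device_id,
        "currency": currency,
        "createdAt": now,
        "updatedAt": now,
        "ttl": now + GUEST_TTL_DAYS * 86400,
    }
```

With these helpers, `item_sk("SKU-123", {"size": "M", "color": "Navy"})` yields `ITEM#SKU-123#color:Navy;size:M`. Any stable ordering works as long as every writer uses the same one.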

API Surface (Idempotent by Default)

All mutating endpoints accept Idempotency-Key (UUID). The service stores the key per cart and returns the prior result if replayed.

POST   /carts                           -> { cartId }                    # create guest cart
GET    /carts/{cartId}                  -> { items[], currency, ... }
POST   /carts/{cartId}/items            body: {sku, qty, attrs?, priceSnapshot?, source?}
PATCH  /carts/{cartId}/items/{itemId}   body: {qty, promoCodes?}
DELETE /carts/{cartId}/items/{itemId}
# itemId contains '#' and ';' (see Reference Payloads), so URL-encode it in paths

# Auth-required
POST   /carts/attach                    body: {cartId}        # attach guest cart to signed-in user
POST   /carts/merge                     body: {guestCartId}   # merge guest cart into the user's active cart
GET    /me/cart                         -> current cart for principal

Common Errors

  • 409 CONFLICT: conditional write failed (stale merge / race); retry with fresh reads.
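  • Idempotent replay (not an error): a repeated Idempotency-Key with a matching request hash returns the original status and payload. Reusing a key with a different payload → 409 CONFLICT.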
  • 422 UNPROCESSABLE: invalid payload (e.g., negative qty, malformed attrs).

Core Flows

1) Create & Use a Guest Cart

  1. POST /carts creates a cart bound to deviceId (cookie).

  2. POST /carts/{id}/items upserts a line:

    • If ITEM# exists → qty += delta; else insert.
    • Store idemKey on the line row or keep a per-cart IDEM#<key> row.
  3. Cache the cart JSON in Redis (cart:<id>, TTL 10–30s). On writes, publish CART_UPDATED:<id> to invalidate.
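
One way to combine the upsert with the per-cart IDEM#<key> row is a two-action transaction: the Put claims the idempotency key and the Update applies the delta, so a replay fails the condition instead of double-counting. The sketch below only builds the `TransactItems` list in DynamoDB's low-level attribute format; the function name and the table name passed in are assumptions, and the list would be handed to a TransactWriteItems call by the caller.

```python
def add_item_transaction(table: str, cart_id: str, item_sk: str,
                         delta: int, idem_key: str, now_iso: str) -> list:
    """Build TransactItems for an idempotent 'qty += delta' upsert."""
    pk = {"S": f"CART#{cart_id}"}
    return [
        {
            # Claim the idempotency key; fails if this key was already used.
            "Put": {
                "TableName": table,
                "Item": {"PK": pk, "SK": {"S": f"IDEM#{idem_key}"}},
                "ConditionExpression": "attribute_not_exists(PK)",
            }
        },
        {
            # ADD creates qty (starting from 0) when the line does not exist yet.
            "Update": {
                "TableName": table,
                "Key": {"PK": pk, "SK": {"S": item_sk}},
                "UpdateExpression": "SET updatedAt = :now ADD qty :delta",
                "ExpressionAttributeValues": {
                    ":now": {"S": now_iso},
                    ":delta": {"N": str(delta)},
                },
            }
        },
    ]
```

If the Put condition fails, the whole transaction is cancelled and the service can look up and return the stored result for that key. A real version would also write attrs and priceSnapshot on first insert and give the IDEM# row an expiry.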

2) Login → Attach or Merge

  • If user has no active cart: attach guest cart (txn: set META.userId, write USER#, delete DEVICE#).

  • If both exist: merge guest → user cart (single transaction):

    • Read both carts; group by (sku, attrs); sum quantities.
    • Keep most recent priceSnapshot for display (re-price at checkout).
    • TransactWriteItems: upsert merged lines into target; delete source lines; update USER#; remove DEVICE#.
    • Emit CartMerged event.
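
The merge combiner itself is pure logic and easy to unit-test in isolation. A minimal sketch, assuming each line is a plain dict with `sku`, optional `attrs`, `qty`, `priceSnapshot`, and an ISO-8601 UTC `updatedAt` string (so string comparison orders timestamps correctly):

```python
from typing import Dict, List, Tuple


def merge_lines(target: List[dict], source: List[dict]) -> List[dict]:
    """Group lines by (sku, attrs), sum quantities, keep the newest snapshot."""
    merged: Dict[Tuple, dict] = {}
    for line in list(target) + list(source):
        attrs = line.get("attrs") or {}
        key = (line["sku"], tuple(sorted(attrs.items())))
        current = merged.get(key)
        if current is None:
            merged[key] = dict(line)  # copy so the inputs are not mutated
            continue
        # Keep every field (including priceSnapshot) from the most recent line.
        newest = line if line["updatedAt"] > current["updatedAt"] else current
        merged[key] = {**newest, "qty": current["qty"] + line["qty"]}
    return list(merged.values())
```

The result is the set of lines to upsert into the target cart; the source cart's rows are then deleted in the same transaction.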

3) Mutations

  • PATCH sets absolute quantity. qty = 0 → delete line; a negative qty is rejected with 422.
  • Use ConditionExpression on updatedAt or itemVersion to avoid lost updates.
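
As an illustration of the conditional write, the sketch below builds either an UpdateItem or a DeleteItem request guarded by `itemVersion`. The function name and return shape are hypothetical; negative quantities are treated as invalid input, matching the 422 case listed under Common Errors.

```python
from typing import Tuple


def patch_qty_request(table: str, cart_id: str, item_sk: str, qty: int,
                      expected_version: int, now_iso: str) -> Tuple[str, dict]:
    """Return (operation name, request params) for an absolute-quantity PATCH."""
    if qty < 0:
        raise ValueError("qty must be >= 0")  # surfaces as 422 to the client
    key = {"PK": {"S": f"CART#{cart_id}"}, "SK": {"S": item_sk}}
    guard = {":expected": {"N": str(expected_version)}}
    if qty == 0:
        # Delete the line, but only if nobody changed it since we read it.
        return "delete_item", {
            "TableName": table,
            "Key": key,
            "ConditionExpression": "itemVersion = :expected",
            "ExpressionAttributeValues": guard,
        }
    return "update_item", {
        "TableName": table,
        "Key": key,
        "UpdateExpression": (
            "SET qty = :qty, updatedAt = :now, itemVersion = itemVersion + :one"
        ),
        "ConditionExpression": "itemVersion = :expected",
        "ExpressionAttributeValues": {
            **guard,
            ":qty": {"N": str(qty)},
            ":now": {"S": now_iso},
            ":one": {"N": "1"},
        },
    }
```

When the stored `itemVersion` no longer matches, DynamoDB rejects the write with a conditional check failure, which the service maps to 409 so the client can refetch and retry.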

4) Reads

  • Attempt Redis → fall back to DynamoDB Query on PK=CART#<id>.
  • Optional enrichment (title, image, price range) via Catalog cache keyed by sku.
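
The DynamoDB fallback is a single Query on the cart partition followed by a small assembly step. This sketch assumes the rows have already been deserialized into plain Python values, and it skips any bookkeeping rows such as IDEM#<key>; both function names are illustrative.

```python
from typing import Iterable


def cart_query_params(table: str, cart_id: str) -> dict:
    """Query parameters that fetch every row of one cart partition."""
    return {
        "TableName": table,
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": {"S": f"CART#{cart_id}"}},
    }


def assemble_cart(rows: Iterable[dict]) -> dict:
    """Fold META, ITEM# and ITEM#DISC# rows into the GET /carts/{cartId} shape."""
    cart = {"items": [], "discounts": []}
    for row in rows:
        sk = row["SK"]
        body = {k: v for k, v in row.items() if k not in ("PK", "SK")}
        if sk == "META":
            cart["cartId"] = row["PK"][len("CART#"):]
            cart["currency"] = row.get("currency")
            cart["updatedAt"] = row.get("updatedAt")
        elif sk.startswith("ITEM#DISC#"):
            cart["discounts"].append(body)
        elif sk.startswith("ITEM#"):
            cart["items"].append({"itemId": sk, **body})
        # Anything else (e.g. IDEM# rows) is internal and not returned.
    return cart
```

Large carts may span several Query pages, so a real reader would keep following the pagination key until it is absent before assembling.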

Idempotency & Concurrency

  • Idempotency-Key scoped to (cartId, operation). Persist a compact record {hash(request), resultRef, expiresAt}. Replays with matching hash return the same result.
  • Conditional writes protect against stale overwrites; compare updatedAt or itemVersion.
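  • Transactions ensure atomic merge/attach. TransactWriteItems accepts up to 100 actions (4 MB total) per call, so cap the number of lines per cart to keep a merge inside one transaction.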
  • No global locks; contention remains per-cart and per-line.
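
The idempotency record can be sketched with an in-memory dictionary standing in for the DynamoDB rows. The class and exception names are hypothetical, and rejecting a reused key whose request hash differs is an assumption about the desired behaviour rather than something the design above spells out.

```python
import hashlib
import json
import time
from typing import Callable, Optional


class IdempotencyConflict(Exception):
    """Same key reused with a different request body."""


class IdempotencyStore:
    def __init__(self, ttl_seconds: int = 86400):
        self._ttl = ttl_seconds
        self._records = {}  # (cartId, operation, key) -> record

    def run(self, cart_id: str, operation: str, key: str, request: dict,
            handler: Callable[[dict], dict], now: Optional[float] = None) -> dict:
        now = time.time() if now is None else now
        scope = (cart_id, operation, key)
        # Canonical JSON so key order in the payload does not change the hash.
        digest = hashlib.sha256(
            json.dumps(request, sort_keys=True).encode("utf-8")
        ).hexdigest()
        record = self._records.get(scope)
        if record is not None and record["expiresAt"] > now:
            if record["hash"] != digest:
                raise IdempotencyConflict(key)
            return record["result"]  # replay: same result, handler not re-run
        result = handler(request)
        self._records[scope] = {
            "hash": digest, "result": result, "expiresAt": now + self._ttl,
        }
        return result
```

In the real service the record would be written in the same transaction as the mutation, so a crash between "apply" and "remember" cannot happen.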

Caching Strategy

  • Redis stores full cart documents for hot reads; short TTL to minimize staleness.
  • Write-through: after DynamoDB commit, refresh Redis; on mutation, publish invalidate message for multi-node caches.
  • Do not CDN-cache personalized cart APIs.
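
The read path and invalidation can be sketched with a tiny in-process cache standing in for Redis. `CartCache`, `load_cart`, and the injectable clock are illustrative names; in production the dictionary would be replaced by Redis keys such as cart:<id> with an expiry, and `invalidate` would be triggered by the published invalidation message.

```python
import time
from typing import Callable, Dict, Tuple


class CartCache:
    """Cache-aside reads with a short TTL and explicit invalidation."""

    def __init__(self, load_cart: Callable[[str], dict],
                 ttl_seconds: float = 15.0,
                 clock: Callable[[], float] = time.monotonic):
        self._load_cart = load_cart  # falls back to the DynamoDB read
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, cart_id: str) -> dict:
        entry = self._entries.get(cart_id)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # fresh hit
        cart = self._load_cart(cart_id)  # miss or expired: read the source of truth
        self._entries[cart_id] = (self._clock() + self._ttl, cart)
        return cart

    def invalidate(self, cart_id: str) -> None:
        """Drop the cached copy after a write to this cart."""
        self._entries.pop(cart_id, None)
```

The short TTL bounds staleness even if an invalidation message is lost, which is why the two mechanisms are used together.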

Pricing & Promotions

  • priceSnapshot is a display hint only. Checkout re-prices all lines and promotions.
  • Promotions are computed by a dedicated service; write derived discount lines:
CART#<id>  ITEM#DISC#<promoId>   DiscountLine {amount, reason, scope: "cart"|"item:<itemId>"}
  • On any material change (add/remove/qty), discount lines are recomputed.
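
To show why the snapshot is only a display hint, here is a small sketch of checkout re-pricing. It assumes amounts are integers in minor units (so 1999 means 19.99) and that `current_prices` is a hypothetical lookup from the catalog or pricing service keyed by sku; promotions are left out.

```python
from typing import Dict, List, Tuple


def reprice(lines: List[dict], current_prices: Dict[str, int]) -> Tuple[int, List[str]]:
    """Return (subtotal at current prices, skus whose price differs from the snapshot)."""
    subtotal = 0
    changed = []
    for line in lines:
        price = current_prices[line["sku"]]  # authoritative price, not the snapshot
        if price != line["priceSnapshot"]["amount"]:
            changed.append(line["sku"])
        subtotal += price * line["qty"]
    return subtotal, changed
```

The `changed` list lets the checkout page tell the shopper which prices moved since the item was added.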

SLOs, Metrics, and Alarms

  • Performance: p50 of 50–80 ms or better; p95 < 200 ms for reads and writes (under typical cache hit rates).
  • Availability: 99.9% monthly for GET/POST /items.
  • Golden Signals: RPS, error rate, p95 latency, DynamoDB throttles, Redis hit rate, transaction failure rate.
  • Canaries: synthetic user add/remove every minute. Alarms on sustained error/latency elevation.

Capacity & Cost Sketch

Assume 100M monthly cart operations, 60% reads / 40% writes; avg item size 1 KB; avg cart read returns 4–6 KB.
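
These assumptions translate into average request rates and DynamoDB capacity units with a few lines of arithmetic. The sketch uses the standard sizing rules (one read unit covers a strongly consistent read of up to 4 KB, half that for eventually consistent reads; one write unit covers a write of up to 1 KB) and a 30-day month; it gives averages only, so peak traffic needs separate headroom.

```python
import math

MONTHLY_OPS = 100_000_000
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

avg_ops_per_sec = MONTHLY_OPS / SECONDS_PER_MONTH  # about 38.6
avg_reads_per_sec = avg_ops_per_sec * 0.60         # about 23.1
avg_writes_per_sec = avg_ops_per_sec * 0.40        # about 15.4


def read_units(size_kb: float, strongly_consistent: bool = True) -> float:
    """Capacity units consumed by one read of size_kb."""
    units = math.ceil(size_kb / 4)
    return units if strongly_consistent else units / 2


def write_units(size_kb: float) -> int:
    """Capacity units consumed by one standard (non-transactional) write."""
    return math.ceil(size_kb)


# A 6 KB cart read costs 2 units strongly consistent, 1 unit eventually consistent.
read_capacity = avg_reads_per_sec * read_units(6)    # about 46 units/s on average
write_capacity = avg_writes_per_sec * write_units(1)  # about 15 units/s on average
```

Transactional writes consume twice the units of standard writes, and every Redis hit removes a read from this total, so the real numbers depend heavily on merge frequency and cache hit rate.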

DynamoDB

  • On-demand to start. With steady traffic, move to provisioned + auto scaling.
  • Typical cost: low hundreds to low thousands USD/month depending on request volume and item sizes.

Redis

  • 2× cache.t3.medium (cluster-mode) ≈ low hundreds USD/month.

API

  • API Gateway + Lambda (or ECS Fargate) ≈ low to mid hundreds USD/month, workload-dependent.

Measure with real traffic; tune TTLs and cache hit rate. Enable PITR on DynamoDB.


Failure Modes & Degradation

  • DynamoDB throttling: serve last-known Redis value with a warning; enqueue write to SQS for retry as a last resort.
  • Redis outage: read directly from DynamoDB; latency increases but correctness preserved.
  • Merge races: transaction conflict → 409; client retries after refetch.
  • Expired guest carts: return 404 and create a fresh cart; optionally soft-recover via DEVICE# pointer if present.
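
The "retry after refetch" behaviour for 409s can be sketched as a bounded retry loop with jittered exponential backoff. `Conflict` is a hypothetical exception standing in for a 409 response, and `operation` is expected to re-read the cart itself on every attempt so each retry works from fresh state.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class Conflict(Exception):
    """Raised when the service answers 409 (stale version or merge race)."""


def retry_on_conflict(operation: Callable[[], T], attempts: int = 4,
                      base_delay: float = 0.05,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on Conflict with full-jitter backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except Conflict:
            if attempt == attempts - 1:
                raise  # give up and surface the conflict to the caller
            # Random delay in [0, base * 2^attempt] spreads out competing clients.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("attempts must be >= 1")
```

Keeping the attempt count small matters here: contention is per cart, so repeated conflicts usually mean two sessions are editing the same cart and the user should see the refreshed state.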

Security & Compliance

  • TLS everywhere; KMS encryption at rest (DynamoDB, Redis, S3).
  • Least-privilege IAM (service can only access its table and related streams).
  • JWT verification at API Gateway (JWT authorizer backed by the Cognito user pool). Device cookies are opaque IDs.
  • Cart payloads avoid PII beyond IDs; never store payment data.

Testing & Resilience

  • Unit: merge combiner, idempotency deduper, conditional expression builders.
  • Contract: OpenAPI schemas; consumer-driven tests with checkout.
  • Load: k6/Locust scenarios with steady 5k rps reads and 1k rps writes; flash spikes ×3.
  • Chaos: AWS Fault Injection Simulator (throttles, latency spikes); Redis eviction storms.
  • Soak: 24h run to validate TTLs, memory growth, and retry logic.

Minimal Deployment Blueprint

  • Compute: ECS Fargate (blue/green via CodeDeploy) or Lambda with canary release.
  • Networking: Private subnets, VPC endpoints for DynamoDB/CloudWatch.
  • Config: SSM Parameter Store (non-secrets), Secrets Manager (secrets).
  • Infra as Code: Terraform or AWS CDK; enable PITR and alarms by default.

Reference Payloads

// POST /carts/{cartId}/items  (Idempotency-Key: 1f82...)
{
  "sku": "SKU-123",
  "qty": 2,
  "attrs": {"size":"M","color":"Navy"},
  "priceSnapshot": {"amount": 1999, "currency": "USD"},
  "source": "web"
}
// GET /carts/{cartId}
{
  "cartId": "8d55...fe",
  "currency": "USD",
  "items": [
    {
      "itemId": "ITEM#SKU-123#size:M;color:Navy",
      "sku": "SKU-123",
      "qty": 2,
      "attrs": {"size":"M","color":"Navy"},
      "priceSnapshot": {"amount":1999,"currency":"USD"},
      "updatedAt": "2025-10-18T10:20:00Z"
    }
  ],
  "discounts": [],
  "updatedAt": "2025-10-18T10:20:00Z"
}