Designing a URL Shortener from Scratch
A complete walkthrough — requirements, capacity estimation, database design, caching strategy, and scaling considerations.
Why This Problem Matters
The URL shortener is the "hello world" of system design — deceptively simple on the surface, rich with real engineering decisions underneath. It touches hashing, database design, caching, load balancing, and analytics. Understanding it deeply gives you a framework for thinking about distributed systems.
Requirements & Constraints
Let's define the system clearly before designing anything:
- Functional: Shorten a URL, redirect from short URL to original, optional custom aliases, optional expiration
- Non-functional: Low latency redirects (<100ms p99), high availability (99.9%), eventual consistency is acceptable, analytics on click counts
- Scale assumptions: 100M URLs created/month, 10:1 read-to-write ratio (1B redirects/month), 5-year data retention
Capacity Estimation
Working through the numbers: 100M URLs/month × 12 months × 5 years = 6B total URLs. Each record ~500 bytes (original URL + metadata) = ~3TB storage. At 1B redirects/month, that's ~380 reads/second average, ~1900/second peak (5x). Write rate: ~38/second average.
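The arithmetic above is easy to sanity-check in a few lines. This sketch just reproduces the back-of-envelope numbers from the text (100M URLs/month, 500 bytes/record, 1B redirects/month, 5x peak factor); the 30-day month is an assumption for the per-second rates.

```python
# Back-of-envelope capacity estimation, using the assumptions in the text.
urls_per_month = 100_000_000
months = 12 * 5                                 # 5-year retention
total_urls = urls_per_month * months            # 6,000,000,000

bytes_per_record = 500                          # original URL + metadata
storage_tb = total_urls * bytes_per_record / 1e12   # ~3.0 TB

redirects_per_month = 1_000_000_000
seconds_per_month = 30 * 24 * 3600              # assume a 30-day month
avg_reads = redirects_per_month / seconds_per_month  # ~386/s
peak_reads = avg_reads * 5                           # ~1930/s
avg_writes = urls_per_month / seconds_per_month      # ~39/s
```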
This is a read-heavy system. The ratio tells us caching will be our primary optimization lever.
Key Generation: Base62 Encoding
A 7-character Base62 key (a-z, A-Z, 0-9) gives us 62^7 = ~3.5 trillion unique combinations — more than enough for our 6B requirement. I prefer pre-generating keys and storing them in a key database over computing hashes at write time:
- No collision handling needed (keys are unique by construction)
- Constant-time key assignment vs hash computation + collision check
- Key ranges can be partitioned across multiple application servers without coordination
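With pre-generated keys, the generator is just a Base62 encoding of a sequential counter. A minimal sketch (the alphabet ordering and 7-character zero-padding are assumptions; any fixed ordering works as long as it is used consistently):

```python
# Digits, then lowercase, then uppercase -- an assumed but conventional ordering.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int, width: int = 7) -> str:
    """Encode a non-negative integer as a fixed-width Base62 string."""
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    # Pad short encodings on the left so every key is exactly `width` chars.
    return "".join(reversed(chars)).rjust(width, ALPHABET[0])
```

A key-generation service would run this over counter ranges offline and hand out blocks of unused keys to application servers, which is what makes coordination-free partitioning possible.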
Database Design
Two options: SQL (PostgreSQL) or NoSQL (DynamoDB/Cassandra). For this workload, I'd choose NoSQL. The access pattern is simple — point lookups by short key. No complex joins. No transactions across records. NoSQL databases excel at exactly this: high-throughput key-value lookups with horizontal scalability.
Schema is minimal: short_key (partition key), original_url, created_at, expires_at, user_id (optional), click_count.
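The record shape maps directly onto a small data class; this is an illustrative sketch of the schema above, not a specific database client's API. The expiry helper is an assumed convenience, with `None` meaning the link never expires.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UrlRecord:
    short_key: str                    # partition key, 7-char Base62
    original_url: str
    created_at: float = field(default_factory=time.time)
    expires_at: Optional[float] = None  # None = never expires (assumption)
    user_id: Optional[str] = None
    click_count: int = 0

    def is_expired(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return self.expires_at is not None and now >= self.expires_at
```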
Caching Strategy
With a 10:1 read-write ratio, caching is critical. A Redis cluster in front of the database with an LRU eviction policy handles this well. The 80/20 rule applies — 20% of URLs likely account for 80% of traffic. Caching even 10-20% of URLs could serve most reads from memory.
- Cache-aside pattern: Check cache → miss → read DB → populate cache → return. Simple, proven.
- TTL: Set cache TTL slightly shorter than URL expiration to avoid serving expired links.
- Cache size: Daily active URLs are bounded by daily redirects, ~33M (1B/month ÷ 30). Caching 20% of those at 500 bytes each is roughly 3.3GB — easily fits in a single Redis instance initially.
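The cache-aside flow in the first bullet can be sketched as follows. A plain in-process LRU dict stands in for the Redis cluster, and `db_lookup` is a hypothetical function that returns the original URL (or `None`) for a short key — both are assumptions for illustration:

```python
from collections import OrderedDict
from typing import Callable, Optional

class CacheAsideResolver:
    """Cache-aside: check cache -> on miss read DB -> populate cache -> return."""

    def __init__(self, db_lookup: Callable[[str], Optional[str]],
                 capacity: int = 10_000):
        self.db_lookup = db_lookup
        self.capacity = capacity
        self.cache: OrderedDict[str, str] = OrderedDict()

    def resolve(self, short_key: str) -> Optional[str]:
        if short_key in self.cache:               # cache hit
            self.cache.move_to_end(short_key)     # refresh LRU position
            return self.cache[short_key]
        url = self.db_lookup(short_key)           # cache miss -> read DB
        if url is not None:
            self.cache[short_key] = url           # populate cache
            if len(self.cache) > self.capacity:   # LRU eviction
                self.cache.popitem(last=False)
        return url
```

With Redis, the same pattern is a `GET`, a DB read on miss, then a `SET` with a TTL slightly shorter than the URL's expiration, as described above.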
Scaling Considerations
The system scales horizontally at every layer:
- Application servers: Stateless — add more behind a load balancer
- Database: Partition by short_key hash. Each partition handles a subset of the keyspace independently.
- Cache: Consistent hashing across Redis nodes means adding or removing a node remaps only a small fraction of the keyspace, so most cached entries stay put and the resulting burst of cache misses stays bounded.
- Analytics: Click counting can be async — write to a Kafka topic, aggregate with a stream processor, update counts in batches.
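The consistent-hashing idea in the cache bullet can be demonstrated with a minimal ring. This is an illustrative sketch, not a production client; the node names, virtual-node count, and use of MD5 are all assumptions:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: adding a node remaps only a small
    slice of the keyspace instead of reshuffling every key."""

    def __init__(self, nodes, vnodes: int = 100):
        self.vnodes = vnodes          # virtual nodes smooth the distribution
        self.ring = []                # sorted (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key: str) -> str:
        # First ring point clockwise from the key's hash (wrapping around).
        idx = bisect.bisect(self.ring, (self._hash(key), chr(0x10FFFF)))
        return self.ring[idx % len(self.ring)][1]
```

Adding a fourth node to a three-node ring moves roughly a quarter of the keys — the quarter that lands on the new node — while the rest keep their existing cache entries.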