Designing a URL Shortener from Scratch
A complete walkthrough — requirements, capacity estimation, database design, caching strategy, and scaling considerations.
Why This Problem Matters
The URL shortener is the "hello world" of system design — deceptively simple on the surface, rich with real engineering decisions underneath. It touches hashing, database design, caching, load balancing, and analytics. Understanding it deeply gives you a framework for thinking about distributed systems.
Requirements & Constraints
Let's define the system clearly before designing anything:
- Functional: Shorten a URL, redirect from short URL to original, optional custom aliases, optional expiration
- Non-functional: Low latency redirects (<100ms p99), high availability (99.9%), eventual consistency is acceptable, analytics on click counts
- Scale assumptions: 100M URLs created/month, 10:1 read-to-write ratio (1B redirects/month), 5-year data retention
Capacity Estimation
Working through the numbers: 100M URLs/month × 12 months × 5 years = 6B total URLs. Each record ~500 bytes (original URL + metadata) = ~3TB storage. At 1B redirects/month, that's ~380 reads/second average, ~1900/second peak (5x). Write rate: ~38/second average.
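The arithmetic above is easy to sanity-check in a few lines. This sketch just reproduces the back-of-envelope numbers from the text (100M URLs/month, 500 bytes/record, 1B redirects/month, 5x peak factor); the 30-day month is an assumption for the per-second rates.

```python
# Back-of-envelope capacity estimation, using the assumptions in the text.
urls_per_month = 100_000_000
months = 12 * 5                                 # 5-year retention
total_urls = urls_per_month * months            # 6,000,000,000

bytes_per_record = 500                          # original URL + metadata
storage_tb = total_urls * bytes_per_record / 1e12   # ~3.0 TB

redirects_per_month = 1_000_000_000
seconds_per_month = 30 * 24 * 3600              # assume a 30-day month
avg_reads = redirects_per_month / seconds_per_month  # ~386/s
peak_reads = avg_reads * 5                           # ~1930/s
avg_writes = urls_per_month / seconds_per_month      # ~39/s
```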
This is a read-heavy system. The ratio tells us caching will be our primary optimization lever.
Key Generation: Base62 Encoding
A 7-character Base62 key (a-z, A-Z, 0-9) gives us 62^7 = ~3.5 trillion unique combinations — more than enough for our 6B requirement. I prefer pre-generating keys and storing them in a key database over computing hashes at write time:
- No collision handling needed (keys are unique by construction)
- Constant-time key assignment vs hash computation + collision check
- Key ranges can be partitioned across multiple application servers without coordination
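With pre-generated keys, the generator is just a Base62 encoding of a sequential counter. A minimal sketch (the alphabet ordering and 7-character zero-padding are assumptions; any fixed ordering works as long as it is used consistently):

```python
# Digits, then lowercase, then uppercase -- an assumed but conventional ordering.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int, width: int = 7) -> str:
    """Encode a non-negative integer as a fixed-width Base62 string."""
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    # Pad short encodings on the left so every key is exactly `width` chars.
    return "".join(reversed(chars)).rjust(width, ALPHABET[0])
```

A key-generation service would run this over counter ranges offline and hand out blocks of unused keys to application servers, which is what makes coordination-free partitioning possible.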
Database Design
Two options: SQL (PostgreSQL) or NoSQL (DynamoDB/Cassandra). For this workload, I'd choose NoSQL. The access pattern is simple — point lookups by short key. No complex joins. No transactions across records. NoSQL databases excel at exactly this: high-throughput key-value lookups with horizontal scalability.
Schema is minimal: short_key (partition key), original_url, created_at, expires_at, user_id (optional), click_count.
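The record shape maps directly onto a small data class; this is an illustrative sketch of the schema above, not a specific database client's API. The expiry helper is an assumed convenience, with `None` meaning the link never expires.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class UrlRecord:
    short_key: str                    # partition key, 7-char Base62
    original_url: str
    created_at: float = field(default_factory=time.time)
    expires_at: Optional[float] = None  # None = never expires (assumption)
    user_id: Optional[str] = None
    click_count: int = 0

    def is_expired(self, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        return self.expires_at is not None and now >= self.expires_at
```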
Caching Strategy
With a 10:1 read-write ratio, caching is critical. A Redis cluster in front of the database with an LRU eviction policy handles this well. The 80/20 rule applies — 20% of URLs likely account for 80% of traffic. Caching even 10-20% of URLs could serve most reads from memory.
- Cache-aside pattern: Check cache → miss → read DB → populate cache → return. Simple, proven.
- TTL: Set cache TTL slightly shorter than URL expiration to avoid serving expired links.
- Cache size: Daily active URLs are bounded by daily redirects, ~33M (1B/month ÷ 30). Caching 20% of those at 500 bytes each is roughly 3.3GB — easily fits in a single Redis instance initially.
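The cache-aside flow in the first bullet can be sketched as follows. A plain in-process LRU dict stands in for the Redis cluster, and `db_lookup` is a hypothetical function that returns the original URL (or `None`) for a short key — both are assumptions for illustration:

```python
from collections import OrderedDict
from typing import Callable, Optional

class CacheAsideResolver:
    """Cache-aside: check cache -> on miss read DB -> populate cache -> return."""

    def __init__(self, db_lookup: Callable[[str], Optional[str]],
                 capacity: int = 10_000):
        self.db_lookup = db_lookup
        self.capacity = capacity
        self.cache: OrderedDict[str, str] = OrderedDict()

    def resolve(self, short_key: str) -> Optional[str]:
        if short_key in self.cache:               # cache hit
            self.cache.move_to_end(short_key)     # refresh LRU position
            return self.cache[short_key]
        url = self.db_lookup(short_key)           # cache miss -> read DB
        if url is not None:
            self.cache[short_key] = url           # populate cache
            if len(self.cache) > self.capacity:   # LRU eviction
                self.cache.popitem(last=False)
        return url
```

With Redis, the same pattern is a `GET`, a DB read on miss, then a `SET` with a TTL slightly shorter than the URL's expiration, as described above.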
Scaling Considerations
The system scales horizontally at every layer:
- Application servers: Stateless — add more behind a load balancer
- Database: Partition by short_key hash. Each partition handles a subset of the keyspace independently.
- Cache: Consistent hashing across Redis nodes means adding or removing a node remaps only a small fraction of the keyspace, so most cached entries stay put and the resulting burst of cache misses stays bounded.
- Analytics: Click counting can be async — write to a Kafka topic, aggregate with a stream processor, update counts in batches.
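The consistent-hashing idea in the cache bullet can be demonstrated with a minimal ring. This is an illustrative sketch, not a production client; the node names, virtual-node count, and use of MD5 are all assumptions:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: adding a node remaps only a small
    slice of the keyspace instead of reshuffling every key."""

    def __init__(self, nodes, vnodes: int = 100):
        self.vnodes = vnodes          # virtual nodes smooth the distribution
        self.ring = []                # sorted (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key: str) -> str:
        # First ring point clockwise from the key's hash (wrapping around).
        idx = bisect.bisect(self.ring, (self._hash(key), chr(0x10FFFF)))
        return self.ring[idx % len(self.ring)][1]
```

Adding a fourth node to a three-node ring moves roughly a quarter of the keys — the quarter that lands on the new node — while the rest keep their existing cache entries.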