System Design Series — Part 11
A few years ago, I worked on a project where a single database query took around 300 milliseconds.
That doesn't sound bad.
Until you realize:
10,000 users were executing the same query repeatedly.
The database CPU started increasing.
Response times became slower.
Users complained.
Infrastructure costs grew.
The interesting part?
The data being requested was almost identical for everyone.
We weren't facing a database problem.
We were facing a caching problem.
Today, almost every large-scale application relies heavily on caching.
Netflix.
Amazon.
Instagram.
Uber.
YouTube.
Spotify.
Without caching, many of these platforms would struggle to handle their traffic efficiently.
Let's understand why.
The Real Problem
Imagine an e-commerce website.
A user opens a product page.
The application performs:
-
Database query
-
Business logic
-
Data processing
-
Response generation
Now imagine:
100,000 users open the same product page.
Without caching:
The database executes the same query 100,000 times.
This creates:
-
High database load
-
Increased latency
-
Higher infrastructure costs
-
Poor scalability
Clearly, this isn't efficient.
A Simple Real-World Analogy
Imagine a library.
Every time someone asks for a popular book:
The librarian goes to a storage warehouse located 10 kilometers away.
Returns with the book.
Repeats the process for every visitor.
That would be incredibly slow.
A smarter approach is:
Keep frequently requested books on a nearby shelf.
Popular books become instantly available.
Caching works exactly the same way.
What is Caching?
Caching is the process of storing frequently accessed data in a fast storage layer.
Instead of repeatedly retrieving data from the database:
Applications first check the cache.
Architecture:
User
↓
Application
↓
Cache
↓
Database (if needed)
The goal is simple:
Return data faster while reducing backend workload.
How Caching Works
Let's say a user requests product information.
Step 1
Application checks cache.
↓
Step 2
If data exists:
Return immediately.
(Cache Hit)
↓
Step 3
If data does not exist:
Query database.
(Cache Miss)
↓
Step 4
Store result in cache.
↓
Step 5
Future requests use cached data.
This process can reduce response times dramatically.
Cache Hit vs Cache Miss
Cache Hit
Requested data already exists in cache.
Result:
-
Fast response
-
No database query
-
Better performance
This is what we want.
Cache Miss
Data isn't found in cache.
The application must:
-
Query database
-
Generate response
-
Store result
Cache misses are expensive.
Good system design aims to maximize cache hits.
Why Companies Use Caching
Faster Response Times
Cache access is significantly faster than database access.
Example:
Database Query:
200 ms
Cache Lookup:
2 ms
That's often a 100x improvement.
Reduced Database Load
Instead of millions of identical queries:
The database handles only a small percentage.
This improves overall system stability.
Better Scalability
Applications can serve more users without constantly upgrading databases.
This is one of the reasons large-scale systems scale efficiently.
Lower Infrastructure Costs
Less database traffic means:
-
Fewer resources
-
Lower cloud costs
-
Better efficiency
Real-World Example
Imagine Instagram.
Millions of users open profiles repeatedly.
Without caching:
Every profile view would require a database query.
Instead:
Popular profile information is cached.
Result:
-
Faster loading
-
Lower database pressure
-
Better user experience
The same principle applies to:
-
Product pages
-
User profiles
-
News feeds
-
Search results
Popular Caching Technologies
Redis
The most popular caching solution today.
Used by:
-
Netflix
-
Uber
-
Instagram
-
Airbnb
Known for:
-
Speed
-
Simplicity
-
Scalability
Memcached
Another popular distributed caching system.
Often used for simple key-value caching.
Types of Caching
Application Cache
Stored inside application memory.
Very fast.
Limited by server resources.
Distributed Cache
Shared across multiple servers.
Examples:
-
Redis
-
Memcached
Most modern systems use distributed caching.
CDN Cache
Stores:
-
Images
-
CSS
-
JavaScript
-
Videos
Closer to users.
This is what we discussed in the CDN article.
Database Cache
Some databases maintain internal caches automatically.
This helps reduce disk access.
Where is Cache Placed?
A typical architecture:
Users
↓
CDN
↓
Load Balancer
↓
Application Servers
↓
Redis Cache
↓
Database
Notice something important:
The cache sits before the database.
That's intentional.
The database is usually the most expensive component.
Caching protects it.
Cache Eviction
Cache storage isn't unlimited.
Eventually old data must be removed.
Common strategies include:
LRU (Least Recently Used)
Remove data that hasn't been used recently.
Most common approach.
TTL (Time To Live)
Data automatically expires after a specific duration.
Example:
Cache product details for 30 minutes.
Production Challenges
Caching sounds simple.
But it introduces new problems.
Stale Data
What happens when cached data becomes outdated?
Example:
Product price changes.
Cache still returns old value.
This is one of the hardest problems in distributed systems.
Cache Invalidation
Developers often joke:
"There are only two hard things in Computer Science:
Cache invalidation and naming things."
Because keeping cache synchronized with databases is challenging.
Real-World Architecture
Large companies often use:
Users
↓
CDN
↓
Load Balancer
↓
API Gateway
↓
Application Servers
↓
Redis Cluster
↓
Database Cluster
Notice:
Caching appears before expensive database operations.
That's one reason large platforms remain responsive under massive traffic.
Common Developer Mistakes
Some mistakes I frequently see:
Caching Everything
Not all data should be cached.
Frequently changing data may not benefit.
No Expiration Strategy
Old data remains forever.
Users receive outdated information.
Ignoring Cache Failures
Applications should continue working even if cache becomes unavailable.
Treating Cache as Primary Storage
Never assume cache data is permanent.
Cache should be considered temporary.
Interview Tip
A common System Design interview question is:
"How would you reduce database load?"
One of the first answers should be:
Caching.
A strong answer should cover:
-
Cache Hits
-
Cache Misses
-
Redis
-
Cache Eviction
-
Cache Invalidation
-
Performance Improvements
These concepts appear frequently in backend and system design interviews.
Key Takeaways
✔ Caching stores frequently used data in fast storage
✔ It reduces database load
✔ It improves response times
✔ It helps systems scale efficiently
✔ Redis is one of the most widely used caching technologies
✔ Cache invalidation is one of the biggest challenges in distributed systems
✔ Nearly every large-scale application relies heavily on caching
This is Part 11 of the System Design Simplified series.
Next Article:
Part 12 — Database Replication Explained Simply
If this article helped you understand caching better, consider sharing it with fellow developers.
#SystemDesign #Caching #Redis #SoftwareArchitecture #BackendDevelopment #DistributedSystems #Scalability #CloudComputing #SoftwareEngineering #SystemDesignInterview #BackendEngineer #PerformanceOptimization #DatabaseDesign #TechArchitecture #Programming