Web Caching Strategies: Browser, CDN, and Server-Side
Caching is one of the highest-leverage performance optimizations available to web developers. A cached response can be served in microseconds; an uncached response can take hundreds of milliseconds. This guide covers each caching layer from browser to database, with the headers and patterns that work in production.
The Caching Stack: Four Layers
A modern web request passes through multiple potential caches before reaching your application server. Understanding each layer is essential because they operate independently, have different invalidation mechanisms, and serve different purposes:
- Browser cache: Storage on the user's device, managed by the browser. Zero network cost on a cache hit. Controlled by response headers. Cannot be invalidated directly - only expired or bypassed.
- CDN / edge cache: Distributed servers close to users (Cloudflare, Fastly, CloudFront). Reduces origin load and improves latency globally. Can be purged via API.
- Reverse proxy / application cache: Varnish, Nginx proxy_cache, or application-level caching. Sits in front of your application server.
- Application-level cache: Redis or Memcached. Caches expensive database query results, computed values, or session data inside your application code.
Layer 1: Browser Caching and Cache-Control
The Cache-Control response header is the primary mechanism for controlling browser and CDN caching behavior. It replaced the older Expires header and offers much finer control.
The three most important Cache-Control patterns
# Static assets with content hash in filename (cache forever)
# e.g., /static/app.3f7a2b.js
Cache-Control: public, max-age=31536000, immutable
# HTML pages (always revalidate, allow CDN caching)
Cache-Control: public, no-cache
ETag: "abc123"
# API responses with authenticated/personalized data (never cache)
Cache-Control: private, no-store
# Shared API responses (cache, allow stale while fetching fresh)
Cache-Control: public, max-age=60, stale-while-revalidate=300
Cache-Control directive reference
- public - Can be cached by any cache (browser, CDN, proxy)
- private - Only the end-user's browser can cache it; CDNs must not
- max-age=N - Cache is fresh for N seconds
- no-cache - Cache must revalidate with the server before using the cached copy (despite the name, it does not prevent caching)
- no-store - Do not cache at all, ever. Use for sensitive data like banking pages
- immutable - The resource will never change; no revalidation needed during the max-age period. Significant performance win for fingerprinted assets.
- stale-while-revalidate=N - Serve stale content while fetching a fresh copy in the background (for N seconds after max-age expires). Eliminates revalidation latency from the user's perspective.
- stale-if-error=N - Serve stale content if the origin returns an error, for N seconds. Great for availability during origin outages.
- s-maxage=N - Like max-age but only for shared caches (CDNs). Overrides max-age for CDNs. Lets you have a short browser TTL but a long CDN TTL.
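These directives combine into a small number of recurring decisions, which can be collected into one function. A minimal, framework-agnostic sketch - the file extensions and TTL values here are illustrative assumptions, not a standard:

```python
def cache_control_for(path: str, authenticated: bool) -> str:
    """Pick a Cache-Control value based on what is being served."""
    if authenticated:
        # Personalized data: never store anywhere
        return "private, no-store"
    if path.endswith((".js", ".css", ".woff2", ".png", ".webp")):
        # Fingerprinted static assets: cache for a year, skip revalidation
        return "public, max-age=31536000, immutable"
    if path.endswith(".html") or path == "/":
        # HTML: always revalidate with the origin (pairs with an ETag)
        return "public, no-cache"
    # Shared API responses: short TTL plus background refresh
    return "public, max-age=60, stale-while-revalidate=300"
```

Wire this into whatever sets response headers in your stack; the point is that the decision is a pure function of the request, so it is easy to test.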
Layer 2: Conditional Requests and ETags
ETags and Last-Modified enable conditional requests - a mechanism where the browser asks the server "has this changed?" and gets a 304 Not Modified response (no body) if it has not. This saves bandwidth without sacrificing freshness.
How ETags work
# Server sends:
HTTP/1.1 200 OK
ETag: "v1-abc123-1711555200"
Cache-Control: no-cache
Content-Type: text/html
# Browser caches the response. Next request sends:
GET /page.html HTTP/1.1
If-None-Match: "v1-abc123-1711555200"
# If unchanged, server responds:
HTTP/1.1 304 Not Modified
ETag: "v1-abc123-1711555200"
# (no body - saves the full page download)
Generate strong ETags from a hash of the content (e.g., MD5 of the response body) or a content version identifier. Weak ETags (prefixed with W/) indicate semantic equivalence without byte-for-byte identity - suitable when gzip encoding varies the bytes.
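As a sketch, strong-ETag generation and the conditional check above might look like this in Python (MD5 over the exact response bytes; the function names are mine):

```python
import hashlib
from typing import Optional

def make_etag(body: bytes) -> str:
    # Strong ETag: a quoted MD5 hash of the exact response bytes
    return f'"{hashlib.md5(body).hexdigest()}"'

def conditional_status(if_none_match: Optional[str], etag: str) -> int:
    # 304 when the client's cached copy matches; otherwise 200 with a full body
    if if_none_match is not None and if_none_match == etag:
        return 304
    return 200
```

Note that the quotes are part of the ETag value on the wire, so compare the header verbatim.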
Layer 3: CDN Caching
CDNs like Cloudflare, AWS CloudFront, and Fastly add an edge-caching layer between your origin server and users worldwide. They respect Cache-Control and Vary headers, but each has additional configuration options.
Key CDN caching concepts
- Cache key: By default, CDNs cache by URL. The Vary header tells the CDN to also vary the cache by request header values (e.g., Vary: Accept-Encoding creates separate cached copies for gzip vs. brotli responses). Misuse of Vary can destroy CDN cache efficiency.
- Bypass rules: Always bypass the CDN cache for authenticated requests, shopping cart pages, checkout flows, and any URL with a session cookie that affects the response.
- Origin shield: Many CDNs offer a single "shield" PoP that fetches from your origin and serves other PoPs. This concentrates origin traffic and dramatically improves cache fill efficiency for low-traffic content.
- Purge API: Unlike browser caches, CDN caches can be purged on demand. Always purge CDN cache after deployments that change HTML or CSS. Use tag-based purging when available (Cloudflare Cache Tags, Fastly surrogate keys).
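A tag-based purge can be wrapped in a small helper. This sketch builds a request for Cloudflare's v4 purge endpoint without sending it; the zone ID and token are placeholders, and tag-based purging requires a Cloudflare plan that supports Cache-Tags:

```python
import json

API_BASE = "https://api.cloudflare.com/client/v4"

def build_purge_request(zone_id: str, api_token: str, tags: list) -> tuple:
    """Return (url, headers, body) for a tag-based purge.

    Send the result with any HTTP client as a POST; splitting the
    construction from the send keeps this testable without a network.
    """
    url = f"{API_BASE}/zones/{zone_id}/purge_cache"
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"tags": tags})
    return url, headers, body
```

Call it from your deploy pipeline with the tags touched by the release (e.g., `build_purge_request(zone_id, token, ["product-42"])`).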
Practical Cloudflare configuration example
# nginx origin headers for Cloudflare
location ~* \.(js|css|png|jpg|webp|woff2)$ {
    # Cloudflare respects Cache-Control
    add_header Cache-Control "public, max-age=31536000, immutable";
    # Optional: surface the upstream's Cloudflare cache status for debugging
    add_header CF-Cache-Status $upstream_http_cf_cache_status;
}

location / {
    # s-maxage for CDN, shorter max-age for browsers
    add_header Cache-Control "public, s-maxage=3600, max-age=0, stale-while-revalidate=86400";
}
Layer 4: Server-Side Caching (Varnish, Redis)
Varnish: Full HTTP reverse proxy cache
Varnish sits in front of your application server and caches full HTTP responses in memory. It is extremely fast (millions of requests per second on a single node) and fully programmable via VCL (Varnish Configuration Language).
# VCL: cache public pages, bypass for authenticated users
sub vcl_recv {
    # Bypass cache if user is logged in
    if (req.http.Cookie ~ "session_id") {
        return (pass);
    }
    # Remove analytics cookies that would vary the cache key
    set req.http.Cookie = regsuball(req.http.Cookie, "(_ga|_gid|_fbp)=[^;]+(; )?", "");
    if (req.http.Cookie == "") {
        unset req.http.Cookie;
    }
}

sub vcl_backend_response {
    # Cache all 200 responses for public content
    if (beresp.status == 200 && !beresp.http.Set-Cookie) {
        set beresp.ttl = 1h;
        set beresp.grace = 24h;  # Serve stale during origin downtime
    }
}
Redis: Application-level caching
Redis is an in-memory data store used to cache expensive database query results, computed aggregations, and API responses inside your application code. Unlike Varnish (which caches complete HTTP responses), Redis caches arbitrary data.
# Python example: cache-aside pattern
import redis
import json

r = redis.Redis(host='localhost', port=6379, db=0)

def get_user_orders(user_id: int) -> list:
    cache_key = f"user_orders:{user_id}"
    # Try cache first
    cached = r.get(cache_key)
    if cached:
        return json.loads(cached)
    # Cache miss: fetch from database
    orders = db.query("SELECT * FROM orders WHERE user_id = %s", user_id)
    # Store in cache with 5-minute TTL
    r.setex(cache_key, 300, json.dumps(orders))
    return orders

def update_user_order(user_id: int, order_id: int, data: dict):
    db.update_order(order_id, data)
    # Invalidate the cached orders for this user
    r.delete(f"user_orders:{user_id}")
Cache Invalidation Patterns
Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." Cache invalidation is hard because you need to remove stale data without serving stale responses, without causing cache stampedes, and without missing any cached copies.
Pattern 1: TTL-based expiration
The simplest approach: set a Time To Live (TTL) and let the cache expire naturally. Stale data is served for up to TTL seconds before the cache refreshes. Best for data where slight staleness is acceptable (product catalog, blog posts, public API responses).
Pattern 2: Cache-aside with explicit invalidation
Application code manages the cache explicitly. On read: check cache, populate on miss. On write: update the database, then delete the cache key. Simple and correct, but requires two trips on a cold cache and risks a "thundering herd" if many requests hit the cache simultaneously after invalidation.
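The thundering-herd risk can be reduced with a regeneration lock. A minimal in-process sketch (a dict stands in for Redis here; in a distributed setup you would use a Redis SET NX lock instead of a thread lock):

```python
import threading

cache: dict = {}                 # stands in for Redis
_regen_lock = threading.Lock()   # per-process; use SET NX for multi-node

def get_with_lock(key: str, fetch_from_db):
    """Cache-aside read where only one caller regenerates a missing entry."""
    value = cache.get(key)
    if value is not None:
        return value
    with _regen_lock:
        # Re-check: another thread may have filled the cache while we waited
        value = cache.get(key)
        if value is None:
            value = fetch_from_db()
            cache[key] = value
    return value
```

The double-check inside the lock is what turns a hundred concurrent misses into one database query.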
Pattern 3: Write-through
On every write, update both the database and the cache in the same operation. Ensures cache is always warm. Risk: if the database write succeeds but the cache update fails, the cache is stale until TTL expiry.
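A toy sketch of the write-through flow, with plain dicts standing in for the database and cache:

```python
cache: dict = {}
database: dict = {}  # stands in for the real datastore

def write_through(key: str, value) -> None:
    # Database first (source of truth), then the cache in the same operation
    database[key] = value
    cache[key] = value  # if this step fails, the cache is stale until TTL expiry

def read(key: str):
    # The cache is always warm for keys written through it
    return cache.get(key, database.get(key))
```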
Pattern 4: Event-driven invalidation
Use a message queue or event bus. When data changes, publish an event. Cache subscribers listen and invalidate affected keys. Scales well and decouples the write path from cache management, but adds infrastructure complexity.
Pattern 5: Versioned cache keys
Never invalidate - instead, change the cache key. Store a version counter in Redis. When data changes, increment the version. The old cached data becomes orphaned (eventually evicted by LRU) and the new key starts fresh. Eliminates invalidation race conditions entirely.
# Version-based cache key pattern
def get_cache_key(user_id: int) -> str:
    # int() handles both a missing key and the bytes value Redis returns
    version = int(r.get(f"user_orders_version:{user_id}") or 0)
    return f"user_orders:{user_id}:v{version}"

def invalidate_user_orders(user_id: int):
    r.incr(f"user_orders_version:{user_id}")
Service Workers: Client-Side Programmable Cache
Service workers are JavaScript files that run in the background of the browser, intercepting network requests to enable offline capability and fine-grained client-side caching. They are the foundation of Progressive Web Apps (PWAs).
// service-worker.js
const CACHE_NAME = 'app-v1';
const STATIC_ASSETS = ['/app.css', '/app.js', '/offline.html'];

// Cache static assets on install
self.addEventListener('install', (event) => {
  event.waitUntil(
    caches.open(CACHE_NAME).then(cache => cache.addAll(STATIC_ASSETS))
  );
});

// Stale-while-revalidate for navigation requests
self.addEventListener('fetch', (event) => {
  if (event.request.mode === 'navigate') {
    event.respondWith(
      caches.open(CACHE_NAME).then(async cache => {
        const cached = await cache.match(event.request);
        const networkFetch = fetch(event.request).then(response => {
          // Only cache successful responses
          if (response.ok) cache.put(event.request, response.clone());
          return response;
        }).catch(() => cached || cache.match('/offline.html'));
        return cached || networkFetch;
      })
    );
  }
});
Frequently Asked Questions
What is the difference between no-cache and no-store?
no-cache does not mean "do not cache." It means the cache must revalidate with the origin before serving the cached response. The response is still stored locally; it just cannot be served without checking with the server first. no-store means do not store the response anywhere - not in the browser, not in a CDN, nowhere. Use no-store for sensitive pages like banking dashboards or medical records. Use no-cache for content that should always be fresh but where the overhead of re-downloading unchanged content is wasteful.
How do I cache-bust after a deployment?
The best approach for static assets is content-based fingerprinting: include a hash of the file content in the filename (e.g., app.3f7a2b.js). Tools like Webpack, Vite, and Parcel do this automatically. With a new hash in the filename, the URL changes on every deployment, so the browser always fetches the new file - no explicit cache busting needed. For HTML files (which reference these fingerprinted assets), set Cache-Control: no-cache so browsers always revalidate the HTML but still benefit from 304 responses when it has not changed.
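Build tools handle fingerprinting for you, but the idea fits in a few lines. A sketch that mirrors the app.3f7a2b.js example (MD5 truncated to six hex characters; the truncation length is an illustrative choice):

```python
import hashlib

def fingerprint(filename: str, content: bytes, length: int = 6) -> str:
    """app.js + content -> app.<hash>.js; the hash changes whenever content does."""
    digest = hashlib.md5(content).hexdigest()[:length]
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"
```

Identical content always maps to the identical filename, so unchanged assets keep their long-lived cache entries across deployments.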
Should I cache API responses at the CDN?
It depends on whether the response is the same for all users. Public API endpoints returning product listings, blog posts, or public data should absolutely be CDN-cached. Set Cache-Control: public, s-maxage=60 to cache at the CDN for 60 seconds. Authenticated or user-specific API responses must never be CDN-cached - use Cache-Control: private, no-store. The Vary: Authorization header is an alternative but most CDNs treat it as "do not cache."
What causes cache stampedes and how do I prevent them?
A cache stampede (also called "thundering herd") occurs when a popular cached item expires and dozens or hundreds of simultaneous requests all find the cache empty and hit the origin server simultaneously. Prevention strategies: use stale-while-revalidate to continue serving the old cached response while one request fetches the new one in the background; use a mutex/lock in your application so only one request regenerates the cache and others wait; or use probabilistic early expiration (PER) to refresh the cache slightly before it expires to avoid the cliff edge.
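Probabilistic early expiration is small enough to sketch. One published formulation (often called XFetch) refreshes when now - cost * beta * ln(U) >= expiry, with U uniform on (0, 1); the function name and parameters below are my own:

```python
import math
import random
import time
from typing import Optional

def should_refresh_early(expiry_ts: float, recompute_cost: float,
                         beta: float = 1.0, now: Optional[float] = None) -> bool:
    """XFetch-style check for refreshing a cache entry before it expires.

    Because ln(U) is negative, each caller's effective "time" jumps
    randomly ahead, so refreshes spread out before the real expiry
    instead of all landing on it at once. Larger beta refreshes earlier.
    """
    if now is None:
        now = time.time()
    u = random.random() or 1e-12  # guard against log(0)
    return now - recompute_cost * beta * math.log(u) >= expiry_ts
```

In practice the entry that wins this check recomputes the value and resets the TTL; everyone else keeps serving the still-fresh copy.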
How does the Vary header affect caching?
The Vary header tells caches that the response varies based on specific request headers. For example, Vary: Accept-Encoding means the CDN stores a separate cached copy for gzip and brotli responses. Vary: Accept-Language stores separate copies per language. Be careful: Vary: Cookie or Vary: Authorization will effectively disable CDN caching for those responses since every user has different cookies or auth tokens. Design your API to avoid Vary on headers that differ per user for cacheable responses.
The Bottom Line
A correct caching strategy layered from browser to CDN to application dramatically reduces latency, origin load, and infrastructure costs. The rules of thumb: cache static assets forever with content hashes; set short CDN TTLs with stale-while-revalidate for HTML; use private, no-store for authenticated responses; and cache expensive database reads in Redis with explicit invalidation on writes. Start with correct Cache-Control headers - everything else builds on that foundation.
Usman has 10+ years of experience securing enterprise infrastructure, managing high-traffic servers, and building zero-knowledge security tools.