What you need to know about caching: Types of Cache, Distributed Caching Strategies and Eviction Policies
from zero to millions of users
Caching is a critical optimization technique in system design that improves performance by storing frequently accessed data in a faster, temporary storage layer. By reducing the need to retrieve data from slower sources such as databases or remote servers, caching increases speed and efficiency in applications ranging from web services to operating systems.
Hi there, it’s George here again with another System Design Newsletter.
In this blog post, we will explore the types of caching, different caching strategies, and cache eviction policies, helping you understand how to design an effective caching system.
As stated earlier, caching is an optimization technique in system design that improves performance by storing frequently accessed data in a faster, temporary layer of memory. In a computer system, the hardware cache sits between main memory (RAM) and the central processing unit (CPU). By keeping frequently accessed data closer to the CPU, the cache enables quicker access times, reducing the latency of fetching data from slower main memory.
The Importance of Caching
Faster Data Access: By keeping frequently accessed data near the processor, caching reduces delays associated with retrieving information from slower storage sources like disk drives or network servers.
Reduced Latency: Caching helps reduce the latency associated with accessing data from distant or slower storage devices. By keeping frequently used data readily available in cache memory, systems can minimize the delay experienced by users when accessing resources.
Scalability: Caching enhances system scalability by offloading processing and storage burdens from backend systems. By caching frequently accessed data, systems can efficiently handle increasing user loads without performance degradation.
Improved User Experience: Faster data access and reduced latency contribute to a smoother and more responsive user experience. Whether it's web browsing, accessing files, or running applications, caching helps deliver content and services quickly, leading to higher user satisfaction.
Types of Cache
Let’s take a closer look at each type of cache. My plan is not to bore you with low-level details, since my focus is on distributed caching, so we’ll keep these sections brief.
1. CPU Cache
The CPU cache, an essential part of modern processors, enhances system performance by speeding up access to frequently used data and instructions.
Characteristics
Proximity to CPU: CPU cache is located directly on the processor chip, providing extremely fast access compared to accessing data from main memory (RAM).
Hierarchical Structure: CPU cache is organized into multiple levels (L1, L2, L3), each with varying sizes, speeds, and proximity to the CPU cores. L1 cache is the smallest and fastest, located closest to the CPU cores, followed by L2 and L3 caches, which are larger but slower.
Types of CPU Cache
L1 Cache: The L1 cache consists of two separate units: an instruction cache (L1i) and a data cache (L1d). It is the fastest but also the smallest, typically a few tens of KBs per core.
L2 Cache: Serving as an intermediate cache, the L2 cache is larger than L1 but slightly slower. It may be dedicated to individual CPU cores or shared within a core complex.
L3 Cache: As the final cache level, L3 is the largest but also the slowest. It is shared among multiple CPU cores or even across different CPU sockets, enhancing overall data availability.
Use Cases and Benefits
Enhanced Performance: CPU caching significantly speeds up access to frequently used data and instructions, leading to faster execution and improved system responsiveness.
Lower Memory Latency: By keeping frequently accessed data closer to the CPU cores, the cache minimizes delays caused by retrieving data from main memory.
Increased Throughput: By reducing the number of stalls due to memory fetch delays, the cache allows processors to execute instructions more efficiently.
Optimized Multithreading: In multithreaded applications, caching helps enhance thread performance by reducing resource contention and minimizing cache thrashing.
Note that CPU cache is faster than main memory but slower than CPU registers.
2. Disk Cache
Disk cache, also called buffer cache, enhances disk I/O performance by temporarily storing recently accessed data in memory. This reduces the need to repeatedly fetch data from the disk, improving access speeds and overall system efficiency.
Characteristics
RAM-Based Storage: The disk cache uses a portion of RAM to store recently accessed data blocks, reducing the need for frequent disk operations.
Write-Back vs. Write-Through: It can function in write-back mode, where data is first written to the cache and later flushed to disk, or write-through mode, where data is written to both cache and disk simultaneously.
Transparent Operation: The disk cache works seamlessly with the operating system and applications, intercepting disk I/O requests and managing data storage without user intervention.
Functionality
Read Caching: Frequently accessed data blocks are stored in memory. If a requested block is found in the cache (cache hit), it is retrieved instantly. If not (cache miss), it is fetched from the disk and cached for future use.
Write Caching: Write operations are cached to enhance performance. In write-back mode, data is first written to cache, reducing latency, and later flushed to disk asynchronously.
Prefetching: The cache anticipates future data requests based on access patterns, loading data into memory proactively to minimize disk access latency.
Types of Disk Cache
Operating System Cache: Many OSes include built-in disk caching mechanisms to optimize file system performance by buffering disk I/O operations.
Disk Controller Cache: Some disk controllers have onboard cache memory, improving performance independently from the OS and often featuring battery backup to protect cached data during power loss.
Disk Cache Software: Third-party software solutions can enhance or replace built-in caching mechanisms, offering advanced algorithms and optimization options.
Use Cases and Benefits
Improved Disk Performance: Caching reduces disk access times, accelerating both read and write operations.
Enhanced System Responsiveness: Faster data access results in a more responsive computing experience.
Reduced Disk Wear: By decreasing the number of physical disk accesses, caching can extend the lifespan of storage devices.
Optimized Resource Utilization: Caching reduces disk and CPU load, allowing the system to allocate resources more efficiently.
3. Web Cache
Web cache, also called a web proxy cache, is a technique used to store copies of web content—such as HTML pages, images, and multimedia files—locally on servers or client devices. This reduces load times, conserves bandwidth, and improves overall web performance.
In a typical setup, a proxy server sits between clients and the origin web server and acts as the shared cache.
Characteristics
Proxy-Based Storage: Acting as an intermediary, web caches intercept client requests for web content, caching and serving it locally instead of retrieving it from the original web server each time.
Caching Policies: Web caches use various policies to decide which content to store and for how long. These policies are influenced by factors like content popularity, freshness, and available cache space.
Hierarchical Caching: In large-scale setups, web caches can be arranged hierarchically, where lower-level caches serve individual clients and higher-level caches function as intermediaries between lower-level caches and web servers.
Functionality
Content Caching: When a client requests a web resource, the cache checks for an available copy. If found (cache hit), the content is served directly from the cache. If not (cache miss), the cache fetches the resource from the web server and stores it for future use.
Cache Invalidation: Web caches ensure that outdated content is removed by using mechanisms like HTTP validation headers (Last-Modified, ETag) to revalidate content and fetch the latest version from the server when it changes (a small example follows this list).
Content Compression: Some web caches use compression methods (gzip, brotli) to reduce the size of stored content, optimizing storage space and improving cache efficiency.
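To make the invalidation mechanism concrete, here is a minimal sketch of HTTP revalidation using Python’s requests library. The URL is an illustrative assumption, but the If-None-Match header and the 304 Not Modified status code are standard HTTP:

```python
import requests

url = "https://example.com/article.html"  # illustrative URL

# First request: the server returns the content plus an ETag fingerprint.
resp = requests.get(url)
etag = resp.headers.get("ETag")

# Later revalidation: ask the server whether our cached copy is still fresh.
resp = requests.get(url, headers={"If-None-Match": etag} if etag else {})
if resp.status_code == 304:
    print("Cached copy is still valid; serve it from the cache")
else:
    print("Content changed; update the cache with the new body")
```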
Types of Web Cache
Proxy Cache: Deployed at the network edge, proxy caches intercept and store web content requested by clients. They reduce latency and bandwidth usage by serving cached content directly.
Browser Cache: Web browsers cache local copies of previously visited web pages and resources. This speeds up subsequent visits to the same websites by loading content from the local cache.
Content Delivery Network (CDN) Cache: CDNs distribute content across a network of geographically distributed edge servers. These caches deliver content to users based on their proximity, improving response times and reducing latency.
Use Cases and Benefits
Faster Web Browsing: By serving cached content directly from local storage, web cache reduces the time spent fetching resources from remote servers, speeding up browsing.
Bandwidth Conservation: Web caching reduces the need for repeated downloads of the same content from remote servers, helping conserve bandwidth and reduce network traffic.
Improved Website Performance: Caching offloads content delivery tasks to local servers, easing the load on the main server, improving scalability, and enhancing website performance.
4. DNS Cache
DNS cache stores the results of recent domain name lookups (domain-to-IP mappings) so that repeated queries can be answered locally instead of contacting upstream DNS servers every time.
Characteristics
Local Storage: DNS cache is stored locally on client devices or DNS servers, holding recent domain name-to-IP mappings for a configurable period.
Time-to-Live (TTL): Each cached DNS record is assigned a TTL value, indicating how long the record remains valid in the cache. After the TTL expires, the record is discarded, and the DNS resolver must query the authoritative servers for updated information.
Hierarchical Caching: DNS cache can be organized hierarchically. Local caches serve individual clients, while intermediate DNS servers cache records on behalf of multiple clients.
Functionality
DNS Resolution Caching: DNS cache stores mappings of domain names to IP addresses. When a client requests the IP address of a domain, the DNS resolver first checks the local cache. If the mapping is found (cache hit), it’s returned immediately. If not (cache miss), the resolver queries the authoritative DNS servers and caches the result for future use (see the sketch after this list).
Negative Caching: DNS cache also stores negative responses, such as failed resolution attempts (e.g., non-existent domains). This reduces the load on authoritative DNS servers and improves overall resolution efficiency.
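As a rough illustration, here is a minimal client-side DNS cache in Python. It uses socket.gethostbyname as the real resolver and a single fixed TTL; actual resolvers honor the per-record TTL returned by the server and also cache negative responses, both of which this sketch omits:

```python
import socket
import time

class SimpleDNSCache:
    """Toy client-side DNS cache with a single fixed TTL."""
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.records = {}  # domain -> (ip, expiry_timestamp)

    def resolve(self, domain):
        entry = self.records.get(domain)
        if entry is not None and entry[1] > time.time():
            return entry[0]                    # cache hit
        ip = socket.gethostbyname(domain)      # cache miss: real lookup
        self.records[domain] = (ip, time.time() + self.ttl)
        return ip
```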
Types of DNS Cache
Client-Side DNS Cache: Devices like computers, smartphones, and routers maintain their own local caches to store DNS records for frequently visited domain names, speeding up subsequent lookups and enhancing browsing performance.
DNS Server Cache: DNS servers (e.g., recursive DNS servers by ISPs, or public services like Google Public DNS) maintain caches for multiple clients. These caches reduce DNS lookup times and improve network performance by serving cached records instead of querying authoritative servers.
Use Cases and Benefits
Faster DNS Resolution: DNS cache speeds up DNS lookups by returning cached IP mappings, reducing the time spent querying external DNS servers.
Improved Network Performance: Local DNS caching reduces lookup latency, especially for frequently visited websites, improving overall network performance.
Reduced DNS Query Traffic: By serving cached records, DNS cache decreases the number of queries sent to authoritative DNS servers, reducing network traffic and easing the load on DNS infrastructure.
Enhanced Reliability: DNS cache improves reliability by ensuring that cached records are available even if authoritative DNS servers are unreachable or down, ensuring uninterrupted access to services.
Types of Caching Strategies
Caching strategies determine how data is written to and read from the cache. The right strategy depends on system requirements such as consistency, performance, and reliability.
1. Write-Through Cache
Data is written to both the cache and the underlying storage simultaneously.
Ensures data consistency between the cache and persistent storage.
Slower write performance due to dual writes.
Common in database systems where strong consistency is needed.
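A minimal sketch of the write-through pattern, assuming hypothetical cache and db objects with simple set/put methods (any key-value store and database client would do):

```python
def write_through(key, value, cache, db):
    # Persist to the backing store first, then update the cache,
    # so the cache never holds data that storage doesn't have.
    db.put(key, value)      # durable write (the slow part)
    cache.set(key, value)   # keep the cache consistent with storage
```

The write is acknowledged only after both layers succeed, which is what makes writes slower but keeps reads strongly consistent.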
2. Write-Back Cache
Data is written to the cache first and later persisted to storage asynchronously.
Improves write performance but introduces a risk of data loss if the cache fails before syncing.
Used in scenarios where performance is a priority, such as disk caching.
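Here is a simplified write-back sketch under the same assumed cache/db interface: writes land in memory immediately, and a background thread flushes dirty keys to storage later. A crash between a write and the next flush loses that write, which is exactly the risk described above:

```python
import threading
import time

class WriteBackCache:
    def __init__(self, db, flush_interval=5.0):
        self.db = db
        self.data = {}
        self.dirty = set()                 # written but not yet persisted
        self.lock = threading.Lock()
        threading.Thread(target=self._flush_loop,
                         args=(flush_interval,), daemon=True).start()

    def set(self, key, value):
        with self.lock:
            self.data[key] = value         # fast: memory only
            self.dirty.add(key)

    def _flush_loop(self, interval):
        while True:
            time.sleep(interval)
            with self.lock:
                pending = {k: self.data[k] for k in self.dirty}
                self.dirty.clear()
            for k, v in pending.items():
                self.db.put(k, v)          # lost if we crash before this runs
```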
3. Write-Around Cache
Writes data directly to storage, bypassing the cache.
Reduces cache pollution for infrequently accessed data.
Slower read performance if data is needed soon after writing.
Useful for systems with large datasets where only specific items are frequently accessed.
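A write-around sketch, again with hypothetical cache/db objects; the delete guards against serving a stale cached copy after the write:

```python
def write_around(key, value, cache, db):
    db.put(key, value)   # the write goes straight to storage
    cache.delete(key)    # drop any stale copy; the cache fills on a future read
```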
4. Read-Through Cache
If requested data is not found in the cache, it is fetched from storage and stored in the cache before being returned.
Ensures data availability but increases read latency on cache misses.
Commonly used in content delivery networks (CDNs).
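In a read-through design the cache itself knows how to load missing entries, so application code only ever talks to the cache. A minimal sketch, assuming a loader function that fetches from the backing store:

```python
class ReadThroughCache:
    """The cache transparently loads missing entries from storage."""
    def __init__(self, loader):
        self.loader = loader   # e.g. lambda key: db.get(key)
        self.data = {}

    def get(self, key):
        if key not in self.data:               # cache miss
            self.data[key] = self.loader(key)  # fetch and populate
        return self.data[key]                  # hit path: served from memory
```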
5. Lazy Loading (Cache-Aside)
The application checks the cache first; if the data is missing, it fetches from storage and adds it to the cache.
Reduces cache storage usage but introduces higher latency for cache misses.
Suitable for applications where not all data needs to be cached.
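With cache-aside, the application (not the cache) orchestrates the lookup. A minimal sketch with the same hypothetical cache/db objects; get_user and query_user are illustrative names:

```python
def get_user(user_id, cache, db):
    user = cache.get(user_id)
    if user is not None:
        return user                   # cache hit
    user = db.query_user(user_id)     # cache miss: read from storage
    cache.set(user_id, user)          # populate so the next read is a hit
    return user
```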
Cache Eviction Policies
Cache eviction policies determine which data is removed from the cache when it reaches capacity. Choosing the right eviction strategy is crucial for maintaining optimal cache performance.
1. Least Recently Used (LRU)
Removes the least recently accessed item when the cache is full.
Ensures frequently used items remain in the cache.
Common in database query caches and in-memory caching systems like Redis.
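A compact LRU cache can be built in Python on top of collections.OrderedDict, which keeps keys in insertion order and lets us move a key to the “most recent” end in O(1):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                       # miss
        self.items.move_to_end(key)           # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict least recently used
```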
2. Least Frequently Used (LFU)
Evicts the item accessed the least number of times.
More memory-intensive than LRU as it requires tracking access counts.
Useful in systems where certain items are accessed repeatedly over time.
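A deliberately simple LFU sketch that tracks an access count per key, which is the extra bookkeeping mentioned above. Eviction here scans all keys (O(n)); production implementations use frequency buckets for O(1) eviction:

```python
from collections import defaultdict

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.counts = defaultdict(int)   # access count per key

    def get(self, key):
        if key not in self.data:
            return None
        self.counts[key] += 1
        return self.data[key]

    def set(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.data, key=lambda k: self.counts[k])
            del self.data[victim]        # evict the least-used key
            del self.counts[victim]
        self.data[key] = value
        self.counts[key] += 1
```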
3. First-In-First-Out (FIFO)
Evicts the oldest item first, regardless of access frequency.
Simple to implement but can lead to suboptimal caching performance.
Used in limited-resource embedded systems where complexity needs to be minimal.
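FIFO needs nothing more than a queue of insertion order, which is part of why it is so simple to implement. A minimal sketch using collections.deque:

```python
from collections import deque

class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.order = deque()               # keys in insertion order

    def get(self, key):
        return self.data.get(key)          # lookups never reorder anything

    def set(self, key, value):
        if key not in self.data:
            if len(self.data) >= self.capacity:
                oldest = self.order.popleft()
                del self.data[oldest]      # evict the oldest insertion
            self.order.append(key)
        self.data[key] = value
```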
4. Random Replacement (RR)
Removes a randomly chosen item.
Fast but inefficient in ensuring frequently used items remain cached.
Useful in cases where access patterns are unpredictable.
5. Time-To-Live (TTL) Based Eviction
Items expire after a specified time period.
Helps prevent stale data from being served.
Often used in DNS caching and web session management.
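A minimal TTL cache with lazy expiry: each entry stores an expiry timestamp and is evicted when a read finds it stale. Real systems often add a background sweeper as well so stale entries don’t linger unread:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.items = {}  # key -> (value, expiry_timestamp)

    def set(self, key, value):
        self.items[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expiry = entry
        if time.time() > expiry:
            del self.items[key]   # stale: evict lazily on access
            return None
        return value
```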
Choosing the Right Strategy
When designing a caching system, consider the following:
Data Consistency: Write-through caching ensures strong consistency, while write-back caching may lead to temporary inconsistencies.
Performance Needs: Write-back caching and lazy loading improve performance but may introduce stale data.
Cache Storage Limits: LFU and LRU optimize storage usage for frequently accessed items.
Use Case: Applications with real-time data (e.g., stock prices) may benefit from TTL-based eviction, while database query caches work well with LRU.
Conclusion
Caching is a powerful tool for optimizing system performance. In this post, I’ve done my best to cover everything I understand about caching and how to use it. I hope you learned something new, and I appreciate your patience. Give yourself a pat on the back for me!
References
System Design Interview by Alex Xu