4 minute read

Caching is utilized to speed up a system and reduce or improve the latency of that system.

Different operations or data transfers take varying amounts of time. Caching is a way to design a system so that it avoids time-consuming operations or data transfers, such as network requests, and instead relies on faster alternatives.

To put it more simply, caching involves storing data in a location different from its original source, allowing for quicker access to the data from this new location.

Caching can be implemented in various parts of a system:

  • Client-level caching: By caching data at the client level, the need for the client to request data from the server is eliminated, reducing latency.

  • Server-level caching: In some cases, the client may need to interact with the server, but the server doesn’t always need to query the database to retrieve data. In this scenario, caching data at the server level can help improve the system’s performance.

  • Intermediate caching: Caching can also be placed between two components in the system, such as between a server and a database. This approach can further optimize the system by reducing the need for direct communication between components.

Types of Caching: In-memory vs External

In-memory Caching stores data within the same operating system process as the consuming application. This approach provides low-latency access and is ideal for high-speed data access and smaller cache sizes.

External Caching manages the cache in a separate operating system process, either locally or remotely. Solutions like Redis or Memcached offer more scalability and flexibility, making this approach suitable for larger cache sizes, distributed systems, or shared caching across multiple application instances.

Content Delivery Network

When discussing caching, it’s important to mention Content Delivery Networks (CDNs). A CDN is a third-party service that functions as a cache for your web servers. Web applications may experience slow performance for users in specific regions if your servers are located far from those users. CDNs have servers distributed globally, ensuring lower latency compared to your own servers. These CDN servers are commonly referred to as PoPs (Points of Presence). Some popular CDNs include Cloudflare and Google Cloud CDN.

When Caching may be helpful?

  1. Speed up network operations and avoid redundancy
    Caching can be particularly helpful when you want to speed up network operations and avoid performing the same request multiple times. By caching the results of these requests, you can quickly serve the data without accessing the database repeatedly, leading to more efficient network operations.

  2. Accelerate computationally intensive operations
    In situations where a server performs a computationally expensive operation in response to a client’s request, caching can be beneficial in speeding up the process. By caching the results of these resource-intensive tasks, you can prevent the need to perform them multiple times, ultimately improving the system’s performance.

  3. Manage operations that occur frequently
    Caching is also advantageous when managing operations that are performed very frequently. By using caching, you can for example prevent overloading the database with numerous redundant requests. Although individual network requests might still be relatively fast, caching can help distribute the cached data among multiple clients, reducing the overall stress on the database.

Cache Invalidation

Cache invalidation is a crucial aspect of caching, as it ensures that the data served from the cache remains accurate and up-to-date. In scenarios where data is stored in multiple locations, such as a database and a server-side cache, maintaining consistency between these two sources of truth is essential.

Consider an example where you store posts in a database and also cache them on the server-side. In this case, both the database and the cache contain the same information, creating two sources of truth. The challenge lies in keeping these sources in sync when the data changes.

When a post is updated or deleted, the cached version needs to be invalidated or refreshed to maintain consistency between the cache and the database. There are several cache invalidation strategies to tackle this issue:

  • Write-through: Update the cache immediately when the data in the database changes. This approach ensures that the cache is always up-to-date, but it may lead to increased latency for write operations due to waiting for both cache and primary storage updates.
    • Advantages: Fast retrieval, complete data consistency, robust to system disruptions
    • Disadvantages: Higher latency for write operations.
  • Write-back: Write data to the cache first and asynchronously update the primary data storage later. This strategy allows for faster write operations by reducing wait times associated with primary storage updates. However, it may introduce the risk of data loss or inconsistency in case of a system failure before the data is synchronized with the primary storage.
    • Advantages: Faster write operations (by reducing wait times associated with primary storage updates), improved write performance, and reduced load on primary storage.
    • Disadvantages: Risk of data loss or inconsistency in case of a system failure before synchronization, potential stale data in primary storage, and added complexity for managing data synchronization.

By implementing an appropriate cache invalidation strategy, you can maintain data consistency between the cache and the primary data source, ensuring that your system serves accurate and up-to-date information.

Cache Eviction Policy

A cache eviction policy determines how values are removed from a cache when it reaches capacity. Eviction is necessary to maintain cache efficiency, free up space for new data, and ensure that the most relevant information is stored. Popular cache eviction policies include LRU (Least Recently Used), FIFO (First In First Out), and LFU (Least Frequently Used).

Each policy offers a different approach to prioritizing which data should be evicted to optimize cache performance based on the specific use case.