Monday, October 16, 2023

Lazy Update - Reduce Redis Load Method

 



In this post we will review a method to reduce the load on Redis. I have used this method in several projects, and it made the difference between a non-working project with excessive costs and a functioning project with reasonable costs.

When working with Redis, there are usually several pods processing work tasks and updating Redis with the results. As a product matures, it typically goes through several gradual steps before settling on the best way to use Redis.

The Steps

The first and naive step is to update Redis for each work task. A work task might be a user action, a web transaction, or a system event, so we expect a huge number of work tasks, and the implication is a huge number of Redis updates.

Trying to reduce the Redis updates usually leads to keeping state in memory. We have multiple Kubernetes pods as part of the same Kubernetes deployment. Each pod keeps its own in-memory state, where all updates are applied, and once in a short period (for example, once every 5 seconds) the state is saved back to Redis. Why is this better? We usually increment the same counters and set values for the same Redis keys for each work task. Instead, we can do this in memory and update Redis only once per period, collapsing multiple increment operations into a single operation. This reduces the number of Redis updates from O(N), where N is the number of work tasks, to roughly O(number of save periods) per key, regardless of how many work tasks arrive.
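The in-memory step can be sketched roughly as follows. This is a minimal illustration, not the implementation used in the projects mentioned above: the class name LazyRedisBuffer and its methods are hypothetical, and the client is assumed to expose redis-py style pipeline(), incrby(), set() and execute() calls.

```python
import threading
from collections import defaultdict


class LazyRedisBuffer:
    """Hypothetical helper: collects updates in memory and flushes them in one batch."""

    def __init__(self, client):
        # Assumption: `client` behaves like a redis-py client (has pipeline()).
        self._client = client
        self._lock = threading.Lock()
        self._increments = defaultdict(int)  # key -> accumulated delta
        self._values = {}                    # key -> last value set

    def incr(self, key, amount=1):
        # Called once per work task; touches memory only, never Redis.
        with self._lock:
            self._increments[key] += amount

    def set(self, key, value):
        # Only the latest value per key is kept until the next flush.
        with self._lock:
            self._values[key] = value

    def flush(self):
        # Swap out the pending state under the lock so workers can keep going.
        with self._lock:
            increments, self._increments = self._increments, defaultdict(int)
            values, self._values = self._values, {}
        if not increments and not values:
            return 0
        # One pipelined round trip: a single INCRBY per counter, a single SET per key.
        pipe = self._client.pipeline()
        for key, amount in increments.items():
            pipe.incrby(key, amount)
        for key, value in values.items():
            pipe.set(key, value)
        pipe.execute()
        return len(increments) + len(values)
```

Each pod would call incr() and set() for every work task, and call flush() from a timer once per save period. A thousand increments of the same counter then reach Redis as a single INCRBY. Note that in this sketch, updates taken out of the buffer are lost if the flush fails or the pod dies before the next flush, which is exactly the trade-off discussed next.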

The next step is reducing the updates even more. Part of the in-memory state that we keep in the pods is saved only for the case where the pod is terminated and we need to reload the state. Do we really need to save it every 5 seconds? How critical would it be if we lost some of the updates? If some updates are not critical, we can save them at a longer interval, for example once every 10 minutes. Notice that it is important to add a random component to the interval, to prevent all the pods from saving at exactly the same time, which would load Redis and slow the system. An example of time interval calculation is:

10 minutes + random(10 minutes)
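The interval calculation above might look like this in code. This is a sketch under the assumption that random(10 minutes) means a uniform draw between 0 and 10 minutes; the names next_save_delay and run_saver are hypothetical.

```python
import random
import threading


def next_save_delay(base_seconds=600.0, jitter_seconds=600.0, rng=random):
    # 10 minutes + random(10 minutes): each pod draws its own delay,
    # so the saves of different pods drift apart instead of lining up.
    return base_seconds + rng.uniform(0.0, jitter_seconds)


def run_saver(save, stop, base_seconds=600.0, jitter_seconds=600.0):
    # `save` is any callable that persists the non-critical state.
    # `stop` is a threading.Event; wait() returns True once it is set,
    # which ends the loop, and False on timeout, which triggers a save.
    while not stop.wait(next_save_delay(base_seconds, jitter_seconds)):
        save()
```

A pod could run run_saver in a background thread and set the stop event on shutdown. Drawing a fresh delay for every cycle, rather than once at startup, keeps pods that started together from staying synchronized.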

Final Note

In this post we've reviewed methods to reduce the load on Redis.
We have reviewed methods for saving state. The same methods can also be used for loading state.
Using these methods should be one of the first attempts to solve a Redis stress issue, before jumping to the conclusion that the Redis cluster should be upgraded with more CPU and memory resources. This is an effective way to keep costs down.








