greenTechJava: Design and Implement Rate Limiter

Rate limiter is one of the most important aspects when designing a microservice system. It is used to limit the usage of an api upto a particular threshold. When an api exceeds its TPS (Transactions Per Second) threshold limit, it should ignore those extra requests to save all downstream services and should send proper http status to its client.

Let's take an example to understand this. Let's say there is an api whose maximum TPS it can handle is 500 requests per second. If we apply a rate limiter on this api upto 500 request per second and we experience 700 request per second in a particular time window, then 200 extra requests will be ignored with an http status 429 i.e. TOO_MANY_REQUEST. Before going into detail of how to implement a Rate limiter in a distributed environment, let's understand various advantages of using it.

Advantages of using a Rate limiter

(1) Saving downstream Resources : You want to limit the usage of an api so that your downstream resources stay healthy and do not exhaust on high TPS. Every api has some maximum threshold limit of TPS that can bear without any system resource failure.

(2) Saving downstream microservices : Rate limiter will help in avoiding cascading effect on downstream microservices. A service might be calling other microservices for pulling data and heavy load on a single service will have high load on its downstream services also.

(3) For monetary purposes : You want your api to be used as a paid api. Then you can have separate limits of free version and paid version. For example : 200 requests in 1 minute for free and after that it will be paid.

(4) Limiting offer burn : There are some apis which provide coupons or promocodes or cashback offers to users. These kind of apis should have a Rate limiter so that even if you have a bug in your system, you could avoid the state of heavy cash burn.

(5) Disadvantage of not using Rate limiter : If one of your api exceeds its threshold TPS, more system resources like server cpu, memory and jvm threads will be consumed to serve extra requests. For example thread timeouts on your downstream database (like mysql, cassandra or elasticSearch) will increase jvm thread usage count as a result CPU consumption will increase. So new requests will not be served and this may cause ELB 5xx. This will downgrade the performance of your own service. Also it will have a cascading effect on upstream services or clients as well.

Rate Limiter Filter on Api Endpoint

As you can see in the diagram above, rate limiter logic can be implemented in the J2EE filter layer. All requests which exceed a particular threshold in a particular rolling time window are filtered and an http 429 response status is returned from the filter layer itself.

Concept behind Rate limiter :

One url in a service can have many rate limiter configurations. So this is a one to many relationship, which can either be stored in memory or in a database table.

For example url = “/v1/product” may have below rate limiter configs
[
{2 sec window : 500 threshold requests },
{10 sec window : 2,000 threshold requests },
{60 sec window : 10,000 threshold requests }
]

So this is the initial config we need to create for a service. Then we need to collect request count data in such a way that we can validate our config with it.

Redis provides a data structure that fits for this problem : sorted set. It provides O(1) and O(log n) time operations to solve our problem.

Rate limiter Algorithm :

                
1 For all RL configs repeat
  1.1 remove all elements in redis sorted set before rolling window time (ZREMRANGEBYSCORE) - O(log n)
  1.2 count remaining elements in sorted set (ZCARD) - O(1)
  1.3 check if count is less than allowed threshold - O(1)
  1.4 add new element (ZADD) - O(log n)
2 If any condition of count check failed, return false else return true

Please note that even if any threshold condition fails, we need to add all keys to the redis sorted set.

Let’s understand how request data is stored in Redis sorted set with an example :

api url = /v1/product
and its configs are 
[
  {2 sec, 500 threshold},
  {10 sec, 2000 threshold}
]

So 2 keys will be created in the redis sorted set as we have 2 rate limiter configs and there will be 4 values in the sorted set upon 4 consecutive requests. Something like :

Key1 = {/v1/product}-2-500
Value1 = 
1627563794091 : 1627563794091
1627563865158 : 1627563865158
1627563866219 : 1627563866219
1627563868191 : 1627563868191


Key2 = {/v1/product}-10-2000
Value2 = 
1627563794091 : 1627563794091
1627563865158 : 1627563865158
1627563866219 : 1627563866219
1627563868191 : 1627563868191

Value1 and Value2 represent timestamps (time of epoch) of request.

How Sliding window Rate limiter Algorithm works :

Let's understand how sliding window Rate limiter counts exceeded requests but still adds data (i.e. request epoch time) to redis sorted set. Let's take another example with below config :


{threshold = 5 , window = 60 seconds}

Below are request number and their time of impact in hh:mm:ss format
// count shows number of entries in redis sorted set.
// Even after threshold exceed, request time is saved in redis

Request 1  => 09:30:20, count = 1, status = 200
Request 2  => 09:30:25, count = 2, status = 200
Request 3  => 09:30:50, count = 3, status = 200
Request 4  => 09:31:10, count = 4, status = 200

=> New Request (5)
Request 2  => 09:30:25
Request 3  => 09:30:50
Request 4  => 09:31:10
Request 5  => 09:31:25 , count = 4, status = 200
// Request 1 removed from redis

=> New Request (6)
Request 3  => 09:30:50
Request 4  => 09:31:10
Request 5  => 09:31:22
Request 6  => 09:31:45 , count = 4, status = 200
// Request 2 removed from redis

=> New Request (7)
Request 3  => 09:30:50
Request 4  => 09:31:10
Request 5  => 09:31:22
Request 6  => 09:31:45
Request 7  => 09:31:48 , count = 5, status = 200

=> New Request (8)
Request 4  => 09:31:10
Request 5  => 09:31:22
Request 6  => 09:31:45
Request 7  => 09:31:48
Request 8  => 09:32:05 , count = 5, status = 200
// Request 3 removed from redis

=> New Request (9)
Request 4  => 09:31:10
Request 5  => 09:31:22
Request 6  => 09:31:45
Request 7  => 09:31:48
Request 8  => 09:32:05
Request 9  => 09:32:09 , count = 6, status = 429

=> New Request (10)
Request 5  => 09:31:22
Request 6  => 09:31:45
Request 7  => 09:31:48
Request 8  => 09:32:05
Request 9  => 09:32:09 , count = 6 , status = 429
Request 10 => 09:32:15 , count = 6 , status = 429
// Request 4 removed from redis

=> New Request (11)
Request 7  => 09:31:48
Request 8  => 09:32:05
Request 9  => 09:32:09
Request 10 => 09:32:15 , count = 6 , status = 429
Request 11 => 09:32:46 , count = 5 , status = 200
// Request 5 and 6 removed from redis

In above example of sliding window Rate limiter, both Request 9 and Request 10 has 6 request count in a sliding window of 1 minute.

Using Lua script over a Redis Cluster :

For optimisation of redis queries over for loop (loop over all RL configs) we can also take advantage of a lua script, which will help us to minimise network bandwidth. Redis provides a fault tolerant , scalable solution as Redis cluster where hash sets of keys are equally partitioned over all master nodes.

Also note that when you execute multiple keys from a single lua script, all must execute on a single node in a cluster. When the key for lua script is "{key}mykey", then "key" is used for finding a single node in a cluster by using its hash. That is why the url of api is enclosed within { and } . Please go through this link to understand more about this.

Rate Limiter in a Distributed Environment

As you can see in the above diagram, a distributed solution can be implemented via using the Redis cluster. A request with a given user or a fixed set of request parameters may fall on any server, but their cache key will be stored on the same redis node in a redis cluster. This provides a distributed, scalable and fault tolerant solution to the problem.

Spring Boot implementation

Now Let's analyse code changes required for the above system.

Add below dependency to your pom.xml file.

pom.xml


<dependency>
	<groupid>redis.clients</groupid>
	<artifactid>jedis</artifactid>
	<version>2.9.0</version>
</dependency>

RLConfig and RLThreshold are two DTOs which contain rate limiter config data (threshold and window).

RLConfig.java


public class RLConfig {

  private String clientId;
  private List<RLThreshold> rlThresholds = new ArrayList<>();

  public RLConfig(String clientId, List<RLThreshold> rlThresholds) {
    this.clientId = clientId;
    this.rlThresholds = rlThresholds;
    for (RLThreshold rlThreshold : rlThresholds) {
      rlThreshold.setKey(clientId);
    }
  }

  public String getClientId() {
    return clientId;
  }

  public List<RLThreshold> getRlThresholds() {
    return Collections.unmodifiableList(rlThresholds);
  }

  @Override
  public String toString() {
    return "RLConfig{" + "clientId='" + clientId + '\'' + ", rlThresholds=" + rlThresholds + '}';
  }
}

RLThreshold.java


public class RLThreshold {

  // time unit will be in seconds
  private Integer timeWindow;
  private Integer threshold;
  private String key;
  private Long lastWindowTime;
  private String epoch;

  public RLThreshold(Integer timeWindow, Integer threshold) {
    this.timeWindow = timeWindow;
    this.threshold = threshold;
    lastWindowTime = System.currentTimeMillis() - timeWindow.longValue();
    epoch = String.valueOf(System.currentTimeMillis());
  }

  public void setKey(String clientId) {
    if (key == null) {
      key = Utils.getClusterKey(clientId) + "-" + timeWindow + "-" + threshold;
    }
  }

  public Integer getTimeWindow() {
    return timeWindow;
  }

  public Integer getThreshold() {
    return threshold;
  }

  public String getKey() {
    return key;
  }

  public Long getLastWindowTime() {
    return lastWindowTime;
  }

  public String getEpoch() {
    return epoch;
  }

  @Override
  public String toString() {
    return "RLThreshold{"
        + "timeWindow="
        + timeWindow
        + ", threshold="
        + threshold
        + ", key='"
        + key
        + '\''
        + ", lastWindowTime="
        + lastWindowTime
        + ", epoch='"
        + epoch
        + '\''
        + '}';
  }
}

RateLimiterFilter is spring filter which actually creates an RLConfig object (having rate limiter config) on every request and validates it Redis data store.

RateLimiterFilter.java


@Component
public class RateLimiterFilter extends OncePerRequestFilter {

  private static final Logger LOGGER = LogManager.getLogger(RateLimiterFilter.class);

  @Autowired private RateLimiterService rateLimiterService;

  @Override
  protected void doFilterInternal(
      HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
      throws ServletException, IOException {

    String requestUri = request.getRequestURI();
    LOGGER.info("requestUri : {}", requestUri);
    RLConfig rlConfig = null;
    if (Route.V1_PRODUCT.equalsIgnoreCase(requestUri)) {
      rlConfig = getV1ProductRlConfig();
    } else if (Route.V2_PRODUCT.equalsIgnoreCase(requestUri)) {
      rlConfig = getV2ProductRlConfig();
    }

    boolean allowed = rateLimiterService.isAllowed(rlConfig);
    LOGGER.info("allowed : {}", allowed);
    if (!allowed) {
      response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
      return;
    }
    filterChain.doFilter(request, response);
  }

  // TODO : values can be fetched from cache or stored in memory
  private RLConfig getV1ProductRlConfig() {
    RLThreshold threshold1 = new RLThreshold(2, 500);
    RLThreshold threshold2 = new RLThreshold(5, 1000);
    List<RLThreshold> rlThresholds = new ArrayList<>();
    rlThresholds.add(threshold1);
    rlThresholds.add(threshold2);
    return new RLConfig(Route.V1_PRODUCT, rlThresholds);
  }

  private RLConfig getV2ProductRlConfig() {
    RLThreshold threshold1 = new RLThreshold(3, 400);
    RLThreshold threshold2 = new RLThreshold(10, 1100);
    List<RLThreshold> rlThresholds = new ArrayList<>();
    rlThresholds.add(threshold1);
    rlThresholds.add(threshold2);
    return new RLConfig(Route.V2_PRODUCT, rlThresholds);
  }
}

RateLimiterService is a spring bean which creates a java redis client on application startup. On every single request it executes a lua script with RLConfig data.

RateLimiterService.java


@Service
public class RateLimiterService {

  private static final Logger LOGGER = LogManager.getLogger(RateLimiterService.class);

  @Value("${redisHosts:localhost:6379,localhost:6380,localhost:6381}")
  private String redisHosts;

  private JedisCluster jedisClient;
  private ObjectMapper MAPPER = new ObjectMapper();
  private String luaScript;

  @PostConstruct
  public void init() {

    // create jedis client
    Set<HostAndPort> jedisNodes = new HashSet();
    String[] redisHostSplit = redisHosts.split(",");
    for (String redisHostPort : redisHostSplit) {
      String hostAndPort[] = redisHostPort.split(":");
      jedisNodes.add(new HostAndPort(hostAndPort[0], Integer.parseInt(hostAndPort[1])));
    }
    jedisClient = new JedisCluster(jedisNodes, new GenericObjectPoolConfig());
    LOGGER.info("jedisClient : {}", jedisClient);

    // load lua script
    InputStream inputStream =
        this.getClass().getClassLoader().getResourceAsStream("RlValidator.lua");
    if (inputStream != null) {
      BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
      luaScript = (String) reader.lines().collect(Collectors.joining(System.lineSeparator()));
      LOGGER.info("luaScript : {}", luaScript);
    } else {
      throw new RuntimeException("Unable to load lua script");
    }
  }

  // when key for lua script is "{key}mykey", then key is used
  // for finding single node in a cluster by using its hash
  // https://stackoverflow.com/questions/49622787/lua-script-attempted-to-access-a-non-local-key-in-a-cluster-node
  public boolean isAllowed(RLConfig rlConfig) throws JsonProcessingException {

    if (rlConfig != null) {
      String configJson = MAPPER.writeValueAsString(rlConfig);
      String shaKey = Utils.getClusterKey(rlConfig.getClientId());
      LOGGER.info("shaKey : {} , configJson : {}", shaKey, configJson);
      Long count =
          (Long)
              jedisClient.eval(
                  luaScript,
                  Collections.singletonList(shaKey),
                  Collections.singletonList(configJson));
      LOGGER.info("count : {}", count);
      if (count == 1) {
        return true;
      }
      return false;
    }
    return true;
  }
}

Below is the Utils file which pre append "{" and post append "}" to the url.

For example : {/v1/product}

Utils.java


public class Utils {

  public static String getClusterKey(String key) {
    return "{" + key + "}";
  }
}

Below is the code for Redis lua script having Rate limiter Algorithm.

RlValidator.lua


local function rlAllowed(rlConf)

    local rlThresholds = rlConf.rlThresholds
    local passRateLimiter = 1

    for i = 1, table.getn(rlThresholds) do

        -- Removes all elements in the sorted set stored at key with a score , O(log N)
        redis.call('ZREMRANGEBYSCORE', rlThresholds[i].key, '-inf', rlThresholds[i].lastWindowTime)
        
        -- Returns number of elements in a sorted set, O(1)
        local countOfElements = redis.call('ZCARD', rlThresholds[i].key)

        local rateLimiterResult = countOfElements < tonumber(rlThresholds[i].threshold)
        if (rateLimiterResult == false) then
            -- add key to redis, even if false
            passRateLimiter = 0
        end

        -- add new element in sorted set, O(log N)
        redis.call('ZADD', rlThresholds[i].key, rlThresholds[i].epoch, rlThresholds[i].epoch)
    end

    return passRateLimiter
end

local rlConf = cjson.decode(ARGV[1])
return rlAllowed(rlConf)

Full source code for above implementation is available here.

Reference links :

how-to-implement-rate-limiting-using-redis

rolling-rate-limiter

sliding-rate-limiter-with-redis

lua-script-to-access-non-local-key-redis-cluster

I hope you find this information helpful. Please post your comments for any improvements.