Rate limiting is one of the most important aspects of designing a microservice system. A rate limiter caps the usage of an API at a particular threshold. When traffic to an API exceeds its TPS (Transactions Per Second) limit, the limiter should reject the extra requests to protect downstream services and return an appropriate HTTP status to the client.
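To make this behavior concrete, here is a minimal single-process sketch of a fixed-window rate limiter in Python. It is only an illustration under stated assumptions: the class `FixedWindowRateLimiter`, the function `handle_request`, and the injectable `clock` parameter are hypothetical names chosen for this example, and the sketch is neither thread-safe nor distributed.

```python
import time

HTTP_OK = 200
HTTP_TOO_MANY_REQUESTS = 429


class FixedWindowRateLimiter:
    """Allows at most `limit` requests per window (single process only)."""

    def __init__(self, limit, window_seconds=1.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable (an assumption) so tests need not sleep
        self.window_start = clock()
        self.count = 0

    def allow(self):
        now = self.clock()
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def handle_request(limiter):
    """Return the HTTP status a hypothetical API handler would send."""
    if not limiter.allow():
        # Reject before any downstream service is called.
        return HTTP_TOO_MANY_REQUESTS
    # Downstream work would happen here.
    return HTTP_OK
```

With `limit=5`, seven calls to `handle_request` inside the same window would return 200 five times and 429 twice. A fixed window is used here only because it is the simplest scheme to read; it says nothing about which algorithm suits a distributed setup.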
Let's take an example to understand this. Say an API can handle at most 500 requests per second. If we apply a rate limiter of 500 requests per second to this API and it receives 700 requests in a given one-second window, the 200 extra requests are rejected with HTTP status 429 (Too Many Requests). Before going into the details of how to implement a rate limiter in a distributed environment, let's look at the advantages of using one.