Saturday 11 July 2020

Hystrix Circuit breaker in Java Spring

In this blog I will discuss Hystrix as a circuit breaker in Java. First of all, we should understand what a circuit breaker is and why to use one in a distributed environment.

Netflix's Hystrix library provides an implementation of the circuit breaker pattern. When you apply a circuit breaker to a method, Hystrix watches for failing calls to that method, and, if failures build up to a threshold, Hystrix opens the circuit so that subsequent calls automatically fail.


In this image you can see that a few requests time out, after which the circuit moves to the open state. While the circuit is open, no further calls reach the downstream service and a fallback response is sent instead. Calls are attempted again only once the circuit moves to the half-open state. These terms will get clearer later in this blog.
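
To make the closed, open and half-open states concrete, here is a minimal sketch of a circuit breaker state machine in plain Java. It is not how Hystrix is implemented: the class name SimpleCircuitBreaker, the consecutive-failure threshold and the explicit clock argument are assumptions made only for illustration.

```java
import java.util.function.Supplier;

// Illustrative only: a tiny circuit breaker, not Hystrix's actual implementation.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures that open the circuit (assumed rule)
    private final long sleepWindowMillis; // how long to stay open before allowing a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAtMillis = 0;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    synchronized State state() {
        return state;
    }

    // Runs the action if the circuit allows it, otherwise returns the fallback.
    // The whole call is synchronized to keep the sketch short.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAtMillis < sleepWindowMillis) {
                return fallback.get();      // short-circuit: the downstream is not called
            }
            state = State.HALF_OPEN;        // sleep window elapsed: allow one trial call
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;           // a success closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;         // trip (or re-trip) the circuit
                openedAtMillis = nowMillis;
            }
            return fallback.get();
        }
    }
}
```

Passing the current time as an argument keeps the sketch easy to test. A real breaker would read the clock itself and would not hold a lock while the downstream call runs.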


Why use a circuit breaker:
  • Avoids overloading the unhealthy downstream service so that it can recover
  • It stops cascading failures across services in a distributed environment.
  • Helps to create a system that can survive gracefully when key services are either unavailable or have high latency.
  • It provides fallback options.

Now let's understand how to use Netflix's Hystrix in a Spring Boot project.

According to the Hystrix documentation :
You should use HystrixCommand for a blocking downstream service and HystrixObservableCommand for a non-blocking downstream service.
HystrixCommand : for blocking I/O
HystrixObservableCommand : for non blocking I/O

In this blog we will be using HystrixCommand for a blocking downstream service. Let's start configuring Hystrix in our Spring Boot project.

pom.xml changes

<parent>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-parent</artifactId>
	<version>2.1.2.RELEASE</version>
	<relativePath />
</parent>
<dependencies>
	<!--Hystrix dependencies -->
	<dependency>
		<groupId>org.springframework.cloud</groupId>
		<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
		<version>2.1.2.RELEASE</version>
	</dependency>
	<!--Other spring boot dependencies -->
</dependencies>

Adding the above dependency lets you integrate the Hystrix circuit breaker. Make sure that the version of the Hystrix dependency is compatible with the Spring Boot version of your project.

Hystrix artifact for Spring boot versions :
       
Spring boot version 1.x.x -- spring-cloud-starter-hystrix
Spring boot version 2.x.x -- spring-cloud-starter-netflix-hystrix

Next, you need to enable Hystrix in your Spring Boot application. To do that, add the following annotation to your main Spring Boot configuration class :
@EnableCircuitBreaker

Below is a simple use of the @HystrixCommand annotation :

@HystrixCommand(fallbackMethod = "getFallbackOfferApiResponse", commandKey = "offerCommandKey")
public String offerApi() {
 
    // actual rest call
    return "This is actual response";
}
 
private String getFallbackOfferApiResponse() {

    return "This is fallback Response";
}

In this example, offerApi wraps a call to an external downstream API, which is an HTTP GET API with a String response.
The @HystrixCommand annotation has two important attributes :
  • fallbackMethod : the method that is executed when the call fails, for example because the downstream service is unavailable.
  • commandKey : uniquely identifies the command and is used to look up its Hystrix configuration properties.
Below is an example of Hystrix configuration properties for the above command key :

hystrix.command.offerCommandKey.execution.isolation.strategy=THREAD
hystrix.command.offerCommandKey.execution.isolation.thread.timeoutInMilliseconds=5000
hystrix.command.offerCommandKey.execution.isolation.semaphore.maxConcurrentRequests=40
hystrix.command.offerCommandKey.fallback.isolation.semaphore.maxConcurrentRequests=40
hystrix.command.offerCommandKey.circuitBreaker.requestVolumeThreshold=4
hystrix.command.offerCommandKey.circuitBreaker.sleepWindowInMilliseconds=5000
hystrix.command.offerCommandKey.circuitBreaker.errorThresholdPercentage=50
hystrix.command.offerCommandKey.metrics.rollingStats.timeInMilliseconds=10000
hystrix.command.offerCommandKey.circuitBreaker.enabled=true
Note that every Hystrix property name contains the word offerCommandKey, i.e. the command key of the @HystrixCommand annotation. hystrix.command is the prefix before the command key and the actual Hystrix property name is the suffix after it.
The general pattern is :
hystrix.command.<commandKey>.<propertyName>=<propertyValue>
This way you can have separate Hystrix configurations for multiple downstream APIs.
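
The naming pattern above can be captured in a small helper. This is only a sketch: the class HystrixPropertyKeys and the second command key paymentCommandKey are hypothetical names, not part of Hystrix or of the project in this blog.

```java
import java.util.Properties;

// Hypothetical helper that builds per-command Hystrix property names.
final class HystrixPropertyKeys {
    private static final String PREFIX = "hystrix.command.";

    // Builds hystrix.command.<commandKey>.<propertyName>
    static String commandProperty(String commandKey, String propertyName) {
        return PREFIX + commandKey + "." + propertyName;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Two downstream APIs, each with its own timeout (paymentCommandKey is made up).
        props.setProperty(commandProperty("offerCommandKey",
                "execution.isolation.thread.timeoutInMilliseconds"), "5000");
        props.setProperty(commandProperty("paymentCommandKey",
                "execution.isolation.thread.timeoutInMilliseconds"), "2000");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```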

That's it. This is all you need to do to integrate Hystrix in your code. The main thing, however, is to tune these Hystrix properties so that the circuit breaker behaves properly for your downstream service.

Now let's understand some important Hystrix properties one by one.

execution.isolation.strategy

There are two types of isolation strategies :
  • THREAD : It executes on a separate thread and concurrent requests are limited by the number of threads in hystrix thread-pool
  • SEMAPHORE :  It executes on the calling thread and concurrent requests are limited by the semaphore count
Default value is THREAD


THREAD VS SEMAPHORE
  • For blocking I/O, use a thread-isolated HystrixCommand.
  • For nonblocking I/O, use a semaphore-isolated HystrixObservableCommand.
  • Use SEMAPHORE isolation for a HystrixCommand only when the call volume is so high that the overhead of separate threads becomes too costly.
  • The advantage of the thread pool approach is that requests passed to the application component can be timed out, something that is not possible when using semaphores.
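
The difference between the two strategies can be sketched with plain java.util.concurrent classes. This is not Hystrix code: IsolationSketch and its two methods are hypothetical names, and the sketch only shows the idea that a pool thread lets the caller stop waiting, while a semaphore only limits concurrency on the calling thread.

```java
import java.util.concurrent.*;
import java.util.function.Supplier;

// Illustrative sketch of the two isolation styles; this is not Hystrix code.
final class IsolationSketch {

    // THREAD style: the work runs on a pool thread, so the caller can give up after a timeout.
    static <T> T threadIsolated(ExecutorService pool, Callable<T> call, long timeoutMillis, T fallback) {
        Future<T> future;
        try {
            future = pool.submit(call);
        } catch (RejectedExecutionException e) {
            return fallback;                    // the pool cannot accept more work
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                // interrupt the worker and walk away
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                    // the call itself failed
        }
    }

    // SEMAPHORE style: the work runs on the calling thread, only concurrency is limited.
    static <T> T semaphoreIsolated(Semaphore permits, Supplier<T> call, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback;                    // concurrency limit reached: reject at once
        }
        try {
            return call.get();
        } catch (RuntimeException e) {
            return fallback;
        } finally {
            permits.release();
        }
    }
}
```

In the thread style, the pool size bounds the number of concurrent calls. In the semaphore style, the number of permits does.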

execution.isolation.thread.timeoutInMilliseconds

Time in milliseconds after which Hystrix times out the call. This property only works when the isolation strategy is THREAD.
Default value is 1000.

Hystrix thread pool properties :
coreSize       (default = 10)
maximumSize    (default = 10)
maxQueueSize   (default = -1)

execution.isolation.semaphore.maxConcurrentRequests

Maximum number of requests allowed in HystrixCommand when using SEMAPHORE. If this maximum concurrent limit is hit then subsequent requests will be rejected.
Default value is 10.

fallback.isolation.semaphore.maxConcurrentRequests

Maximum number of concurrent fallback executions allowed in HystrixCommand when using SEMAPHORE. If the maximum concurrent limit is hit, subsequent requests are rejected and an exception is thrown, since no fallback could be retrieved.
Default value is 10.

How to calculate the number of threads in the thread pool or the semaphore count

The theoretical formula for calculating the size is:
Requests per second at peak when healthy × 99th percentile latency in seconds + some breathing room

Let's take an example :
Requests per second per instance at peak time = 60
99th % Latency = 200 ms = 0.2 seconds

Number of threads = 60 x 0.2 + 5 (some extra space) = 17
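
The same arithmetic as a small Java sketch. ThreadPoolSizing and poolSize are hypothetical names, and the latency is taken in milliseconds so the intermediate product stays exact.

```java
// Hypothetical helper that applies the sizing formula above.
final class ThreadPoolSizing {

    // peak requests per second x p99 latency in seconds, rounded up, plus breathing room
    static int poolSize(int peakRequestsPerSecond, int p99LatencyMillis, int breathingRoom) {
        double busyThreads = peakRequestsPerSecond * p99LatencyMillis / 1000.0;
        return (int) Math.ceil(busyThreads) + breathingRoom;
    }

    public static void main(String[] args) {
        // 60 requests/s at peak, 200 ms p99 latency, 5 spare threads
        System.out.println(poolSize(60, 200, 5)); // prints 17
    }
}
```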

circuitBreaker.requestVolumeThreshold

Minimum number of requests in a rolling window that is required before the circuit can trip. Below this volume the circuit stays closed, whatever the error percentage. (explained in the example below)
Default value is 20.

circuitBreaker.sleepWindowInMilliseconds

After tripping the circuit, the amount of time to reject requests before allowing attempts again to determine if the circuit should again be closed.
Default value is 5000.

circuitBreaker.errorThresholdPercentage

Error percentage at or above which the circuit should trip open and start short-circuiting requests to fallback logic. (explained in the example below)
Default value is 50.

metrics.rollingStats.timeInMilliseconds

Duration of the statistical rolling window, in milliseconds. (explained in the example below)
Default value is 10000.

How Hystrix trips circuit

The circuit trips when, within a timespan of duration metrics.rollingStats.timeInMilliseconds, the percentage of actions resulting in a handled exception exceeds errorThresholdPercentage, provided also that the number of actions through the circuit in that timespan is at least requestVolumeThreshold.

Let's understand the above statement with an example. Let's assume that below are the Hystrix configuration properties for our downstream API.

Rolling window time in ms = 5000
Error Threshold percentage = 50%
Request volume threshold = 4

 Request count   Response HTTP status   Timestamp in sec   Circuit status
 1               200                    1                  Closed
 2               500                    1.5                Closed
 3               500                    2                  Closed
 4               500                    3                  Closed
 5               500                    4                  Open
As you can see in the above example, 3 of the first 4 requests (the volume threshold) resulted in an HTTP 500 response, i.e. 75% of the requests failed (greater than the error threshold percentage), and they fall within a time window of 3 - 1 = 2 seconds (less than the rolling window time). Since the circuit breaker thresholds are exceeded, the circuit is open by the time the 5th request arrives.

Another Example of Hystrix Circuit :

Rolling window time in ms = 5000
Error Threshold percentage = 50%
Request volume threshold = 4

 Request count   Response HTTP status   Timestamp in sec   Circuit status
 1               200                    1                  Closed
 2               500                    3                  Closed
 3               500                    5                  Closed
 4               500                    7                  Closed
 5               500                    9                  Closed
In this example, the first 4 requests are spread over 7 - 1 = 6 seconds (greater than the rolling window time), so no 5 second rolling window ever contains 4 requests. The request volume threshold is never met, so the circuit is still in the closed state on the 5th request.
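
Both examples can be replayed with a simplified model of the trip rule. This is a sketch under stated assumptions, not Hystrix's algorithm: the class TripRuleSimulation is hypothetical, the window is evaluated exactly at each request's arrival time, only 5XX responses count as errors, and the sleep window is ignored (once open, the circuit stays open).

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the trip rule, for illustration only.
final class TripRuleSimulation {

    // For each request, reports whether the circuit is already open when it arrives.
    static List<Boolean> circuitOpenOnArrival(double[] timestampsSec, int[] httpStatus,
                                              double windowSec, int errorThresholdPct,
                                              int volumeThreshold) {
        List<Boolean> open = new ArrayList<>();
        boolean tripped = false;
        for (int i = 0; i < timestampsSec.length; i++) {
            if (!tripped) {
                int total = 0;
                int errors = 0;
                // Count the earlier requests that still fall inside the rolling window.
                for (int j = 0; j < i; j++) {
                    if (timestampsSec[j] > timestampsSec[i] - windowSec) {
                        total++;
                        if (httpStatus[j] >= 500) {
                            errors++;
                        }
                    }
                }
                // Trip only if both the volume and the error percentage thresholds are met.
                tripped = total >= volumeThreshold && errors * 100 >= errorThresholdPct * total;
            }
            open.add(tripped);
        }
        return open;
    }

    public static void main(String[] args) {
        int[] status = {200, 500, 500, 500, 500};
        // First example: requests at 1, 1.5, 2, 3 and 4 seconds.
        System.out.println(circuitOpenOnArrival(new double[] {1, 1.5, 2, 3, 4}, status, 5.0, 50, 4));
        // Second example: requests at 1, 3, 5, 7 and 9 seconds.
        System.out.println(circuitOpenOnArrival(new double[] {1, 3, 5, 7, 9}, status, 5.0, 50, 4));
    }
}
```

With the first set of timestamps only the 5th request finds the circuit open. With the second set no request does, because the window never holds 4 requests at once.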

circuitBreaker.enabled

Whether a circuit breaker will be used to track health and to short-circuit requests if it trips.
Default value is true

Note : The Hystrix fallback logic is always executed when an unhandled exception is thrown from the annotated method.

For example :
status code 500 will throw HttpServerErrorException.InternalServerError
status code 400 will throw HttpClientErrorException.BadRequest

If you do not want a 400 response code to count towards breaking the circuit, handle it inside the method :

@HystrixCommand(fallbackMethod = "getFallbackOfferApiResponse", commandKey = "offerCommandKey")
public String offerApi() {

	try {
		// actual rest call
		return "This is actual response";
	} catch (HttpClientErrorException.BadRequest e) {
		// handle 400 response here so that it does not reach Hystrix as a failure
		return "This is 400 (Bad Request) response";
	}
}

You can also use HttpClientErrorException#getStatusCode() to handle any kind of 4XX series exceptions like UNAUTHORIZED(401), METHOD_NOT_ALLOWED(405), UNSUPPORTED_MEDIA_TYPE(415), TOO_MANY_REQUESTS(429) and many others.

Up to this point you have seen how to use Hystrix as a circuit breaker in a Spring Boot project. There are some other cool features of Hystrix that are worth knowing.

The first cool feature is reloading Hystrix properties at runtime, so that you do not need to rebuild your whole application when you want to change or tune them.

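import com.netflix.config.ConcurrentCompositeConfiguration;
import com.netflix.config.ConfigurationManager;
import com.netflix.hystrix.Hystrix;
import com.netflix.hystrix.strategy.HystrixPlugins;

// Reload Hystrix properties at runtime
public void loadProperty(String propertyName, String propertyValue) {

	// Override the property in the configuration that Hystrix reads from
	ConcurrentCompositeConfiguration config = (ConcurrentCompositeConfiguration) ConfigurationManager.getConfigInstance();
	config.setOverrideProperty(propertyName, propertyValue);
	// Reset Hystrix state so that the new value takes effect
	Hystrix.reset();
	HystrixPlugins.reset();
}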

You can store these properties in your database and use a pull-based mechanism, such as a cron job, to update the Hystrix property values dynamically at runtime.
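
A pull-based mechanism of this kind could look like the sketch below. Everything in it is an assumption made for illustration: HystrixPropertyPoller is a hypothetical class, the Supplier stands in for a database query, and the BiConsumer stands in for a method such as loadProperty.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

// Hypothetical poller: fetches properties from a source and applies only the changed ones.
final class HystrixPropertyPoller {
    private final Supplier<Map<String, String>> source; // e.g. a database query (assumed)
    private final BiConsumer<String, String> applier;   // e.g. a loadProperty-style method
    private final Map<String, String> lastApplied = new HashMap<>();

    HystrixPropertyPoller(Supplier<Map<String, String>> source, BiConsumer<String, String> applier) {
        this.source = source;
        this.applier = applier;
    }

    // One polling pass; returns the number of properties that were applied.
    synchronized int pollOnce() {
        int changed = 0;
        for (Map.Entry<String, String> entry : source.get().entrySet()) {
            if (!entry.getValue().equals(lastApplied.get(entry.getKey()))) {
                applier.accept(entry.getKey(), entry.getValue());
                lastApplied.put(entry.getKey(), entry.getValue());
                changed++;
            }
        }
        return changed;
    }

    // Runs pollOnce repeatedly; the caller shuts the returned executor down on exit.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(this::pollOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Applying only the values that changed avoids resetting Hystrix on every poll when nothing in the source was modified.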

Another cool feature of Hystrix is its Dashboard. You can use this dashboard for monitoring and debugging purposes.

To add the Hystrix dashboard, add the dependencies below to your pom.xml file :

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
	<version>2.1.2.RELEASE</version>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

To enable the Hystrix dashboard, add the annotation below to your main Spring Boot configuration class :
@EnableHystrixDashboard

To enable the Hystrix metrics stream, add the following to your application properties file :
management.endpoints.web.exposure.include=hystrix.stream

Hystrix Dashboard Home page example :

Hystrix Dashboard Closed circuit example : 

Hystrix Dashboard Open Circuit example : 


To enable Hystrix logging at debug level, add the logger below :

"name": "com.netflix.hystrix",
"level": "debug"

Code available on Git
