In this blog I will discuss Hystrix as a circuit breaker in Java. First, we should understand what a circuit breaker is and why to use one in a distributed environment.
Netflix's Hystrix library provides an implementation of the circuit breaker pattern. When you apply a circuit breaker to a method, Hystrix watches for failing calls to that method, and, if failures build up to a threshold, Hystrix opens the circuit so that subsequent calls automatically fail.
In this image you can see that a few requests timed out, after which the circuit moved to the open state. No further calls reach the downstream service until the circuit moves to the half-open state. While the circuit is open, a fallback response is sent instead. These terms will become clearer later in this blog.
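To make the closed, open and half-open states concrete, here is a minimal sketch of a circuit breaker in plain Java. It is not Hystrix's implementation: the class name SimpleCircuitBreaker, the consecutive-failure rule and the explicit nowMillis parameter are all assumptions made to keep the example small and deterministic (Hystrix itself uses an error percentage over a rolling window, as explained later).

```java
import java.util.function.Supplier;

// Illustrative circuit breaker; NOT how Hystrix is implemented internally.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before opening (assumed rule)
    private final long sleepWindowMillis; // how long to stay open before allowing a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
    }

    synchronized State state() {
        return state;
    }

    // nowMillis is passed in by the caller so that the behaviour is easy to test.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < sleepWindowMillis) {
                return fallback.get();   // short-circuit: the downstream service is not called
            }
            state = State.HALF_OPEN;     // sleep window elapsed: let one trial call through
        }
        try {
            T result = action.get();
            state = State.CLOSED;        // a successful call closes the circuit
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;      // trip (or re-trip) the circuit
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }
}
```

With a threshold of 2 and a sleep window of 5000 ms, two failing calls open the circuit, calls during the next 5 seconds get the fallback without touching the downstream service, and the first call after that acts as the half-open trial.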
Why use a circuit breaker:
- It avoids overloading an unhealthy downstream service so that it can recover.
- It stops cascading failures across services in a distributed environment.
- It helps to create a system that can survive gracefully when key services are either unavailable or have high latency.
- It provides fallback options.
Now let's understand how to use Netflix's Hystrix in a Spring Boot project.
According to the Hystrix documentation :
You should use HystrixCommand for a blocking downstream service and HystrixObservableCommand for a non-blocking downstream service.
HystrixCommand : for blocking I/O
HystrixObservableCommand : for non-blocking I/O
In this blog we will use HystrixCommand for a blocking downstream service. Let's start configuring Hystrix in our Spring Boot project.
pom.xml changes
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.2.RELEASE</version>
<relativePath />
</parent>
<dependencies>
<!--Hystrix dependencies -->
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
<version>2.1.2.RELEASE</version>
</dependency>
<!--Other spring boot dependencies -->
</dependencies>
Adding the above dependency lets you integrate the Hystrix circuit breaker. Make sure that you use the Hystrix artifact that matches your Spring Boot version.
Hystrix artifact for Spring Boot versions :
Spring Boot version 1.x.x -- spring-cloud-starter-hystrix
Spring Boot version 2.x.x -- spring-cloud-starter-netflix-hystrix
Now you need to enable Hystrix in your Spring Boot application. To do that, add the following annotation to your main Spring Boot configuration class :
@EnableCircuitBreaker
Below is a simple use of the @HystrixCommand annotation :
@HystrixCommand(fallbackMethod = "getFallbackOfferApiResponse", commandKey = "offerCommandKey")
public String offerApi() {
// actual rest call
return "This is actual response";
}
private String getFallbackOfferApiResponse() {
return "This is fallback Response";
}
In this example, there is an external downstream API named offerApi, which is an HTTP GET API with a String response.
The @HystrixCommand annotation has two important keys :
- fallbackMethod : the method that will be executed when your downstream service is not available.
- commandKey : used to uniquely identify the Hystrix configuration properties.
Below is an example of Hystrix configuration properties for the above command key :
hystrix.command.offerCommandKey.execution.isolation.strategy=THREAD
hystrix.command.offerCommandKey.execution.isolation.thread.timeoutInMilliseconds=5000
hystrix.command.offerCommandKey.execution.isolation.semaphore.maxConcurrentRequests=40
hystrix.command.offerCommandKey.fallback.isolation.semaphore.maxConcurrentRequests=40
hystrix.command.offerCommandKey.circuitBreaker.requestVolumeThreshold=4
hystrix.command.offerCommandKey.circuitBreaker.sleepWindowInMilliseconds=5000
hystrix.command.offerCommandKey.circuitBreaker.errorThresholdPercentage=50
hystrix.command.offerCommandKey.metrics.rollingStats.timeInMilliseconds=10000
hystrix.command.offerCommandKey.circuitBreaker.enabled=true
Note that every Hystrix property contains the word offerCommandKey, i.e. the command key of the @HystrixCommand annotation. hystrix.command is the prefix before the command key, and the actual Hystrix property name is the suffix after it.
The general form is :
hystrix.command.<commandKey>.<propertyName>=<propertyValue>
This way you can have separate Hystrix configurations for multiple downstream APIs, one set of properties per command key.
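As a small illustration of this naming scheme, the sketch below builds property keys for two command keys. The helper class HystrixPropertyKeys and the second command key paymentCommandKey are hypothetical and only serve the example; Hystrix itself does not ship such a helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that composes keys of the form
// hystrix.command.<commandKey>.<propertyName>
class HystrixPropertyKeys {
    static String commandProperty(String commandKey, String propertyName) {
        return "hystrix.command." + commandKey + "." + propertyName;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        // Same property, different value per downstream API.
        props.put(commandProperty("offerCommandKey",
                "execution.isolation.thread.timeoutInMilliseconds"), "5000");
        props.put(commandProperty("paymentCommandKey", // hypothetical second command key
                "execution.isolation.thread.timeoutInMilliseconds"), "2000");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```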
That's it. This is all you need to do to integrate Hystrix into your code. The main thing, though, is to tune these Hystrix properties so that the circuit breaker behaves properly for your downstream service.
Now let's understand some important Hystrix properties one by one.
execution.isolation.strategy
There are two types of isolation strategies :
- THREAD : The command executes on a separate thread, and concurrent requests are limited by the number of threads in the Hystrix thread pool.
- SEMAPHORE : The command executes on the calling thread, and concurrent requests are limited by the semaphore count.
Default value is THREAD
THREAD VS SEMAPHORE
- For blocking I/O, use a thread-isolated HystrixCommand.
- For nonblocking I/O, use a semaphore-isolated HystrixObservableCommand.
- Generally, the only time you should use SEMAPHORE isolation for HystrixCommand is when the call volume is so high that the overhead of separate threads becomes too costly.
- The advantage of the thread pool approach is that requests passed to the application component can be timed out, which is not possible when using semaphores.
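The difference between the two strategies can be sketched with plain java.util.concurrent classes. This is only an illustration of the idea, not Hystrix internals: the class IsolationSketch and the methods runOnThread and runWithSemaphore are assumed names. The thread variant can stop waiting after a timeout because the work runs elsewhere, while the semaphore variant runs on the calling thread and can only reject a call when no permit is free.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class IsolationSketch {
    // THREAD style: the task runs on a pool thread, so the caller can give up after a timeout.
    static String runOnThread(ExecutorService pool, Callable<String> task,
                              long timeoutMillis, String fallback) {
        Future<String> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                 // ask the worker thread to stop
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
            return fallback;
        } catch (ExecutionException e) {
            return fallback;                     // the task itself failed
        }
    }

    // SEMAPHORE style: the task runs on the calling thread; excess calls are rejected.
    static String runWithSemaphore(Semaphore permits, Callable<String> task, String fallback) {
        if (!permits.tryAcquire()) {
            return fallback;                     // concurrency limit reached
        }
        try {
            return task.call();
        } catch (Exception e) {
            return fallback;
        } finally {
            permits.release();
        }
    }
}
```

In this sketch, the size of the fixed thread pool plays the role of the thread pool size, and the number of permits plays the role of the semaphore count.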
execution.isolation.thread.timeoutInMilliseconds
Time in milliseconds after which Hystrix times out the call. This property only works when the isolation strategy is THREAD.
Default value is 1000.
Hystrix thread pool properties :
coreSize (default = 10)
maximumSize (default = 10)
maxQueueSize (default = -1)
execution.isolation.semaphore.maxConcurrentRequests
Maximum number of concurrent requests allowed in a HystrixCommand when using SEMAPHORE. If this limit is hit, subsequent requests are rejected.
Default value is 10.
fallback.isolation.semaphore.maxConcurrentRequests
Maximum number of concurrent fallback executions allowed in a HystrixCommand when using SEMAPHORE. If this limit is hit, subsequent requests are rejected and an exception is thrown, since no fallback could be retrieved.
Default value is 10.
How to calculate the number of threads in the thread pool or the semaphore count
The theoretical formula for calculating the size is :
Requests per second at peak when healthy × 99th percentile latency in seconds + some breathing room
Let's take an example :
Requests per second per instance at peak time = 60
99th percentile latency = 200 ms = 0.2 seconds
Number of threads = 60 × 0.2 + 5 (some breathing room) = 17
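The same arithmetic can be written as a small helper. The class PoolSizing and its method poolSize are illustrative names, and rounding up to a whole thread is my own assumption; the formula only says "some breathing room".

```java
class PoolSizing {
    // peak requests per second × p99 latency in seconds, rounded up, plus breathing room
    static int poolSize(int peakRequestsPerSecond, int p99LatencyMillis, int breathingRoom) {
        double busyThreads = peakRequestsPerSecond * p99LatencyMillis / 1000.0;
        return (int) Math.ceil(busyThreads) + breathingRoom;
    }

    public static void main(String[] args) {
        // 60 requests/s, 200 ms p99 latency, 5 spare threads
        System.out.println(poolSize(60, 200, 5)); // prints 17
    }
}
```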
circuitBreaker.requestVolumeThreshold
Minimum number of requests in a rolling window before the circuit is allowed to trip (explained in the example below).
Default value is 20.
circuitBreaker.sleepWindowInMilliseconds
After tripping the circuit, the amount of time to reject requests before allowing attempts again to determine if the circuit should again be closed.
Default value is 5000.
circuitBreaker.errorThresholdPercentage
Error percentage at or above which the circuit should trip open and start short-circuiting requests to fallback logic (explained in the example below).
Default value is 50.
metrics.rollingStats.timeInMilliseconds
Duration of the statistical rolling window, in milliseconds (explained in the example below).
Default value is 10000.
How Hystrix trips the circuit
The circuit trips when, within a timespan of duration metrics.rollingStats.timeInMilliseconds, the percentage of actions resulting in a handled exception is at or above errorThresholdPercentage, provided also that the number of actions through the circuit in that timespan is at least requestVolumeThreshold.
Let's understand the above statement with an example. Assume that the following are the Hystrix configuration properties for our downstream API.
Rolling window time in ms = 5000
Error Threshold percentage = 50%
Request volume threshold = 4
Request count | Response Http Status | Timestamp in sec | Circuit Status |
1 | 200 | 1 | Closed |
2 | 500 | 1.5 | Closed |
3 | 500 | 2 | Closed |
4 | 500 | 3 | Closed |
5 | 500 | 4 | Open |
In this example, the first 4 requests arrive within 2 seconds (3 - 1 = 2), which is inside the rolling window, and 3 of those 4 requests (75%) fail. Both the request volume threshold and the error threshold are met, so the circuit is open when the 5th request arrives.
Another Example of Hystrix Circuit :
Rolling window time in ms = 5000
Error Threshold percentage = 50%
Request volume threshold = 4
Request count | Response Http Status | Timestamp in sec | Circuit Status |
1 | 200 | 1 | Closed |
2 | 500 | 3 | Closed |
3 | 500 | 5 | Closed |
4 | 500 | 7 | Closed |
5 | 500 | 9 | Closed |
In this example, the first 4 requests span 6 seconds (7 - 1 = 6), which is longer than the rolling window. The circuit breaker rule is not fulfilled, so the circuit is still closed on the 5th request.
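The two tables above can be reproduced with a short simulation. This is a simplified model written for this blog, not Hystrix code: the class TripRuleSketch is an assumed name, the window is treated as a plain sliding interval ending at each request, any status of 500 or above counts as an error, and once the circuit opens it simply stays open (the sleep window is ignored).

```java
import java.util.ArrayList;
import java.util.List;

class TripRuleSketch {
    // Returns the circuit status that each request observes when it arrives.
    static List<String> simulate(double[] timesSec, int[] httpStatus, double windowSec,
                                 int errorThresholdPercent, int requestVolumeThreshold) {
        List<String> observed = new ArrayList<>();
        boolean open = false;
        for (int i = 0; i < timesSec.length; i++) {
            if (!open) {
                int total = 0;
                int errors = 0;
                // Look only at earlier requests inside the window that ends at request i.
                for (int j = 0; j < i; j++) {
                    if (timesSec[j] > timesSec[i] - windowSec) {
                        total++;
                        if (httpStatus[j] >= 500) {
                            errors++;
                        }
                    }
                }
                // Trip only if enough requests were seen AND the error share is at or above the threshold.
                open = total >= requestVolumeThreshold
                        && errors * 100 >= errorThresholdPercent * total;
            }
            observed.add(open ? "Open" : "Closed");
        }
        return observed;
    }

    public static void main(String[] args) {
        int[] status = {200, 500, 500, 500, 500};
        System.out.println(simulate(new double[] {1, 1.5, 2, 3, 4}, status, 5, 50, 4));
        System.out.println(simulate(new double[] {1, 3, 5, 7, 9}, status, 5, 50, 4));
    }
}
```

The first call yields Closed four times and then Open, and the second yields Closed for all five requests, matching the tables.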
circuitBreaker.enabled
Whether a circuit breaker will be used to track health and to short-circuit requests if it trips.
Default value is true
Note : Hystrix fallback logic is executed whenever an unhandled exception is thrown from the annotated method.
For example :
status code 500 will throw HttpServerErrorException.InternalServerError
status code 400 will throw HttpClientErrorException.BadRequest
If you do not want to break the circuit for a 400 response code, use the logic below :
@HystrixCommand(fallbackMethod = "getFallbackOfferApiResponse", commandKey = "offerCommandKey")
public String offerApi() {
try {
// actual rest call
return "This is actual response";
} catch (HttpClientErrorException.BadRequest e) {
// handle 400 response
return "This is 400 (Bad Request) response";
}
}
You can also use HttpClientErrorException#getStatusCode() to handle any kind of 4XX series exceptions like UNAUTHORIZED(401), METHOD_NOT_ALLOWED(405), UNSUPPORTED_MEDIA_TYPE(415), TOO_MANY_REQUESTS(429) and many others.
Up to this point you have seen how to use Hystrix as a circuit breaker in a Spring Boot project. Hystrix also has some other cool features worth knowing.
The first cool feature is reloading Hystrix properties at runtime, so that changing or tuning them does not require rebuilding your whole application.
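import com.netflix.config.ConcurrentCompositeConfiguration;
import com.netflix.config.ConfigurationManager;
import com.netflix.hystrix.Hystrix;
import com.netflix.hystrix.strategy.HystrixPlugins;

// Reload Hystrix properties at runtime
public void loadProperty(String propertyName, String propertyValue) {
ConcurrentCompositeConfiguration config = (ConcurrentCompositeConfiguration) ConfigurationManager.getConfigInstance();
// Override the property value in the running configuration
config.setOverrideProperty(propertyName, propertyValue);
// Reset Hystrix state so that the new value is picked up
Hystrix.reset();
HystrixPlugins.reset();
}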
You can store these properties in your database and use a pull-based mechanism, such as a cron job, to update the Hystrix property values dynamically at runtime.
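One possible shape for such a pull-based job is sketched below using only the JDK. Everything here is an assumption for illustration: the class PropertyPoller, the source supplier (standing in for a database query) and the applier callback (which in a real application could delegate to the loadProperty method above). It applies only the values that changed since the previous poll, so Hystrix is not reset on every run.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

class PropertyPoller {
    private final Supplier<Map<String, String>> source; // hypothetical: e.g. reads rows from a database
    private final BiConsumer<String, String> applier;   // hypothetical: e.g. calls loadProperty(name, value)
    private final Map<String, String> lastApplied = new HashMap<>();

    PropertyPoller(Supplier<Map<String, String>> source, BiConsumer<String, String> applier) {
        this.source = source;
        this.applier = applier;
    }

    // Applies only properties whose value changed since the last poll; returns how many were applied.
    // Assumes the source never returns null values.
    synchronized int pollOnce() {
        int changed = 0;
        for (Map.Entry<String, String> entry : source.get().entrySet()) {
            if (!entry.getValue().equals(lastApplied.get(entry.getKey()))) {
                applier.accept(entry.getKey(), entry.getValue());
                lastApplied.put(entry.getKey(), entry.getValue());
                changed++;
            }
        }
        return changed;
    }

    // Polls at a fixed rate; the caller is responsible for shutting the scheduler down.
    // Note: if pollOnce throws, the scheduler stops running further polls.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::pollOnce, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```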
Another cool feature of Hystrix is its dashboard, which you can use for monitoring and debugging.
To add the Hystrix dashboard, add the dependencies below to your pom.xml file :
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix-dashboard</artifactId>
<version>2.1.2.RELEASE</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
To enable the Hystrix dashboard, add the annotation below to your main Spring Boot configuration class :
@EnableHystrixDashboard
To enable the Hystrix metrics stream, add the following to your application properties file :
management.endpoints.web.exposure.include=hystrix.stream
Hystrix Dashboard Home page example :
Hystrix Dashboard Closed circuit example :
Hystrix Dashboard Open Circuit example :
To enable Hystrix logging at debug level, add the logger below :
"name": "com.netflix.hystrix",
"level": "debug"
Code available on Git