A circuit breaker works like an electrical fuse. When a service we depend on runs into trouble, it lets us fail fast and stay fault tolerant. On the one hand, it limits how much a failing dependency can drag down our own service, which prevents an avalanche effect; on the other hand, it lowers the request rate hitting the failing upstream service, which helps that service recover sooner.
Circuit breakers are widely used. Besides wrapping outbound service requests in our own applications, they are common in web gateways and microservices. In this article we study one implementation, github/sony/gobreaker, by reading its source. (An annotated copy of the code can be viewed at github/lpflpf/gobreaker.)
The circuit breaker pattern
gobreaker is a Golang implementation of the circuit breaker pattern from Microsoft's cloud design patterns. It is open-sourced by Sony, had about 1.2K stars at the time of writing, and has a large number of users.
Here is a state machine defined by the pattern:
The circuit breaker has three states and four state transitions.
Three states:
- Closed: requests reach the service normally
- Open: the service is considered abnormal and requests are rejected
- Half-open: a limited number of requests are let through to probe the service
Four state transitions:
- Closed to open: when requests fail while the breaker is closed and the trip condition is met, the breaker moves directly to the open state.
- Open to half-open: once the specified timeout has passed in the open state, the breaker enters the half-open state to check whether the service is available again.
- Half-open to open: if a request fails while the breaker is half-open, it returns to the open state.
- Half-open to closed: if all of the (limited) requests made while half-open succeed, the breaker closes and every request is let through again.
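The three states and four transitions can be written down as a small transition function. This is a minimal sketch, not gobreaker's code: the `Event` type, its constants, and the `next` function are names made up here for illustration.

```go
package main

import "fmt"

// State mirrors the three breaker states described above.
type State int

const (
	StateClosed State = iota
	StateHalfOpen
	StateOpen
)

func (s State) String() string {
	switch s {
	case StateClosed:
		return "closed"
	case StateHalfOpen:
		return "half-open"
	default:
		return "open"
	}
}

// Event is a hypothetical type naming what triggers each transition.
type Event int

const (
	EventTripped         Event = iota // failures met the trip condition
	EventTimeoutElapsed               // the open-state timeout has passed
	EventProbeFailed                  // a request failed while half-open
	EventProbesSucceeded              // all limited half-open requests succeeded
)

// next returns the state reached from s on event e.
// Any pair not listed leaves the state unchanged.
func next(s State, e Event) State {
	switch {
	case s == StateClosed && e == EventTripped:
		return StateOpen
	case s == StateOpen && e == EventTimeoutElapsed:
		return StateHalfOpen
	case s == StateHalfOpen && e == EventProbeFailed:
		return StateOpen
	case s == StateHalfOpen && e == EventProbesSucceeded:
		return StateClosed
	}
	return s
}

func main() {
	s := StateClosed
	events := []Event{
		EventTripped, EventTimeoutElapsed, EventProbeFailed,
		EventTimeoutElapsed, EventProbesSucceeded,
	}
	for _, e := range events {
		s = next(s, e)
		fmt.Println(s)
	}
}
```

Keeping the transitions in one function makes it easy to see that the half-open state is the only one with two ways out.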
The implementation of gobreaker
gobreaker implements a circuit breaker on top of the state machine above.
Definition of the circuit breaker
```go
type CircuitBreaker struct {
	name          string
	maxRequests   uint32                                  // maximum number of requests let through while half-open
	interval      time.Duration                           // counting cycle while the breaker is closed
	timeout       time.Duration                           // how long the breaker stays open before moving to half-open
	readyToTrip   func(counts Counts) bool                // decides from Counts whether to open the breaker; customizable
	onStateChange func(name string, from State, to State) // hook called when the state changes

	mutex      sync.Mutex // the fields below must be updated under this lock
	state      State      // current state
	generation uint64     // identifies the current counting cycle
	counts     Counts     // successes, failures, consecutive successes, consecutive failures, etc.; used to decide whether to trip
	expiry     time.Time  // when the current cycle ends
}
```
Of these fields, the following can be customized:
- maxRequests: the maximum number of requests let through in the half-open state. Once that many consecutive requests succeed, the breaker closes
- interval: the counting cycle in the closed state, after which the counts are cleared. If it is 0, the counts are not cleared periodically while the breaker is closed
- timeout: how long the breaker stays open before requests may be tried again (in the half-open state)
- readyToTrip: a hook function that decides, from the counts, whether the breaker should trip
- onStateChange: a hook function called on every state change
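As an illustration of what a readyToTrip hook might look like, here is a minimal sketch that trips on the failure ratio. The `Counts` type below is a local stand-in for the counters this article describes (it is not imported from gobreaker), and `tripOnFailureRatio` with its thresholds of 10 requests and 50% failures is a hypothetical example.

```go
package main

import "fmt"

// Counts is a local stand-in for the breaker's counters:
// requests, successes, failures, and their consecutive variants.
type Counts struct {
	Requests             uint32
	TotalSuccesses       uint32
	TotalFailures        uint32
	ConsecutiveSuccesses uint32
	ConsecutiveFailures  uint32
}

// tripOnFailureRatio is a hypothetical readyToTrip hook. It reports true
// once at least minRequests have been seen in the current cycle and at
// least half of them failed.
func tripOnFailureRatio(counts Counts) bool {
	const minRequests = 10
	const maxFailureRatio = 0.5
	if counts.Requests < minRequests {
		return false // too little data to judge
	}
	ratio := float64(counts.TotalFailures) / float64(counts.Requests)
	return ratio >= maxFailureRatio
}

func main() {
	// Only 4 requests so far: not enough to trip, even though all failed.
	fmt.Println(tripOnFailureRatio(Counts{Requests: 4, TotalFailures: 4}))
	// 12 of 20 requests failed: the breaker should trip.
	fmt.Println(tripOnFailureRatio(Counts{Requests: 20, TotalFailures: 12}))
}
```

Requiring a minimum number of requests avoids tripping on the first few failures after the counts have been cleared.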
Executing a request
Executing a request through the circuit breaker has three stages: ① the check before the request; ② execution of the service request; ③ the update of state and counts after the request.
```go
// Calling through the circuit breaker
func (cb *CircuitBreaker) Execute(req func() (interface{}, error)) (interface{}, error) {
	// ① Check before the request
	generation, err := cb.beforeRequest()
	if err != nil {
		return nil, err
	}

	defer func() {
		e := recover()
		if e != nil {
			// ③ A panic is counted as a failure, then re-raised
			cb.afterRequest(generation, false)
			panic(e)
		}
	}()

	// ② Execute the request
	result, err := req()

	// ③ Update the counts
	cb.afterRequest(generation, err == nil)
	return result, err
}
```
Check before the request
Before the request, the breaker's current state is checked. If the breaker is open, the request is rejected. If the breaker is half-open and the maximum number of requests has already been reached, the request is also rejected.
```go
func (cb *CircuitBreaker) beforeRequest() (uint64, error) {
	cb.mutex.Lock()
	defer cb.mutex.Unlock()

	now := time.Now()
	state, generation := cb.currentState(now)

	if state == StateOpen {
		// Breaker is open: reject immediately
		return generation, ErrOpenState
	} else if state == StateHalfOpen && cb.counts.Requests >= cb.maxRequests {
		// Half-open and the request limit has been reached: reject immediately
		return generation, ErrTooManyRequests
	}

	cb.counts.onRequest()
	return generation, nil
}
```
The current state is computed from the stored state and the current time. If the breaker is open, it checks whether the timeout has elapsed, and if so changes the state to half-open. If the breaker is closed, it checks whether the current counting cycle has expired, and if so starts the next cycle.
```go
func (cb *CircuitBreaker) currentState(now time.Time) (State, uint64) {
	switch cb.state {
	case StateClosed:
		if !cb.expiry.IsZero() && cb.expiry.Before(now) {
			// Need to enter the next counting cycle
			cb.toNewGeneration(now)
		}
	case StateOpen:
		if cb.expiry.Before(now) {
			// Breaker changes from open to half-open
			cb.setState(StateHalfOpen, now)
		}
	}
	return cb.state, cb.generation
}
```
The length of each cycle also depends on the current state. If the breaker is closed (normal operation), the cycle lasts one interval; if the breaker is open, the cycle lasts one timeout, after which the breaker can move to the half-open state.
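The rule for choosing the expiry can be sketched as follows. This is a simplified illustration, not gobreaker's source: the `breaker` type and `startCycle` method are names invented here. Two details are assumptions of the sketch: a zero interval leaves the expiry unset (which matches the `IsZero` check in `currentState` above), and the half-open state has no deadline.

```go
package main

import (
	"fmt"
	"time"
)

type State int

const (
	StateClosed State = iota
	StateHalfOpen
	StateOpen
)

// breaker holds only the fields this sketch needs.
type breaker struct {
	state      State
	generation uint64
	interval   time.Duration
	timeout    time.Duration
	expiry     time.Time
}

// startCycle begins a new counting cycle and sets its expiry
// according to the current state.
func (b *breaker) startCycle(now time.Time) {
	b.generation++
	switch b.state {
	case StateClosed:
		if b.interval == 0 {
			b.expiry = time.Time{} // assumption: no periodic reset
		} else {
			b.expiry = now.Add(b.interval)
		}
	case StateOpen:
		b.expiry = now.Add(b.timeout)
	default:
		b.expiry = time.Time{} // assumption: half-open has no deadline
	}
}

func main() {
	now := time.Now()
	b := &breaker{state: StateClosed, interval: time.Minute, timeout: 30 * time.Second}

	b.startCycle(now)
	fmt.Println("closed cycle ends in", b.expiry.Sub(now))

	b.state = StateOpen
	b.startCycle(now)
	fmt.Println("open cycle ends in", b.expiry.Sub(now))
}
```

Incrementing the generation on every new cycle is what lets a late result from an older cycle be recognised and ignored.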
Processing after the request
After each request, the breaker updates its counts according to whether the request succeeded.
```go
func (cb *CircuitBreaker) afterRequest(before uint64, success bool) {
	cb.mutex.Lock()
	defer cb.mutex.Unlock()

	now := time.Now()

	// A result from an earlier cycle (generation) is not counted
	state, generation := cb.currentState(now)
	if generation != before {
		return
	}

	if success {
		cb.onSuccess(state, now)
	} else {
		cb.onFailure(state, now)
	}
}
```
If half-open:
- If the request succeeds and the number of consecutive successes is now greater than or equal to maxRequests, the state changes from half-open to closed
- If the request fails, the state changes directly from half-open to open
If closed:
- If the request succeeds, only the counts are updated
- If the request fails, readyToTrip is called to decide whether the state should change from closed to open
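The four cases above can be sketched in a single function. This is a simplified illustration and not gobreaker's onSuccess/onFailure: the `breaker` type, the `record` method, and the reduced `Counts` are invented for this sketch, and clearing the counts on every state change is an assumption made here to keep it short.

```go
package main

import "fmt"

type State int

const (
	StateClosed State = iota
	StateHalfOpen
	StateOpen
)

// Counts is reduced to the two counters this sketch needs.
type Counts struct {
	ConsecutiveSuccesses uint32
	ConsecutiveFailures  uint32
}

type breaker struct {
	state       State
	maxRequests uint32
	counts      Counts
	readyToTrip func(Counts) bool
}

// setState switches state and clears the counts (assumption of this sketch).
func (b *breaker) setState(s State) {
	b.state = s
	b.counts = Counts{}
}

// record applies the outcome of one request to the breaker.
func (b *breaker) record(success bool) {
	if success {
		b.counts.ConsecutiveSuccesses++
		b.counts.ConsecutiveFailures = 0
		// Half-open: close once enough consecutive requests succeeded.
		if b.state == StateHalfOpen && b.counts.ConsecutiveSuccesses >= b.maxRequests {
			b.setState(StateClosed)
		}
		return
	}
	b.counts.ConsecutiveFailures++
	b.counts.ConsecutiveSuccesses = 0
	switch b.state {
	case StateHalfOpen:
		b.setState(StateOpen) // any failure while half-open reopens the breaker
	case StateClosed:
		if b.readyToTrip(b.counts) {
			b.setState(StateOpen)
		}
	}
}

func main() {
	b := &breaker{
		state:       StateClosed,
		maxRequests: 2,
		readyToTrip: func(c Counts) bool { return c.ConsecutiveFailures >= 3 },
	}
	for i := 0; i < 3; i++ {
		b.record(false)
	}
	fmt.Println("open after three failures:", b.state == StateOpen)
}
```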
Summary
- Frequent requests to remote or unreliable third-party services have a high probability of failure. The advantage of a circuit breaker is that these unreliable services cannot drag our own service down and cause an avalanche.
- The circuit breaker has a cost: it maintains a fair amount of statistics and protects them with a mutex, so every request pays for the locking.
- In the half-open state, callers may see many ErrTooManyRequests errors. The breaker stays half-open until the number of consecutive successes reaches maxRequests, and until then any request beyond the maxRequests limit is rejected. For services whose requests are slow but frequent, the breaker may therefore return a large number of too-many-requests errors.
- Reference: Microsoft cloud design patterns ( https://www.microsoft.com/en-...