Retry after errors, with exponential backoff (in Ruby)
Some errors are transient. Say you connect to a remote service, such as a database or an API over HTTP: an error raised by your client is not always permanent. It might be a network glitch or some other temporary condition, and the same call may succeed a moment later.
Here is an attempt (in Ruby) to retry on error, sleeping a little longer before each new attempt.
class WhateverException < StandardError; end
debug_counter = 0
sleep_times = [0.1, 0.2, 0.5, 1]
begin
  fail WhateverException, "counter=#{debug_counter += 1}"
rescue WhateverException
  if time = sleep_times[(nb_retries ||= 0)]
    sleep time
    puts "retry #{nb_retries} after #{time}s"
    nb_retries += 1
    retry
  else
    raise
  end
end
The first two lines are just context: an exception class and a counter for debugging purposes.
sleep_times = [0.1, 0.2, 0.5, 1] is an array of the times, in seconds, that I want to wait before each new attempt.
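The list above is hand-picked, and it only grows roughly exponentially. If you would rather have strictly exponential delays, they can be generated. Here is a minimal sketch, assuming a hypothetical helper named backoff_times whose parameters (base, factor, count) are made up for illustration:

```ruby
# Hypothetical helper (not part of the original script): builds a list of
# delays in seconds where each one is `factor` times the previous one.
def backoff_times(base: 0.1, factor: 2, count: 4)
  Array.new(count) { |attempt| base * factor**attempt }
end

# With the defaults, the delays double at each attempt, starting at 0.1s.
sleep_times = backoff_times
p sleep_times
```

The resulting array can be dropped into the script in place of the literal one, since the rest of the code only indexes into it.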
The begin/rescue block lets us rescue the exception when it occurs, and it is also what makes the retry possible (see below).
When an expected exception occurs, Ruby executes the body of the rescue clause. It looks up the sleep time for the current retry count (nb_retries ||= 0 initializes the counter on the first failure), waits that long, prints a debug line (which you'll want to remove or turn into an audit log message), increments the retry count and executes the retry statement.
A retry statement jumps back to the start of the enclosing begin block and executes it again, unconditionally. Local variables keep their values, so nothing is reset. That's why we have to enforce a maximum number of attempts ourselves, or it will loop forever.
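To see retry in isolation, here is a small sketch with no sleeping at all. The flaky_call method and its limit of 3 attempts are made up for illustration:

```ruby
# Hypothetical demo: the body fails on the first two passes and succeeds
# on the third one.
def flaky_call
  attempts = 0
  begin
    attempts += 1
    fail "attempt #{attempts} failed" if attempts < 3
    "succeeded after #{attempts} attempts"
  rescue RuntimeError
    # retry jumps back to `begin`; `attempts` keeps its value, which is
    # what lets the counter act as a stopping condition.
    retry if attempts < 3
    raise
  end
end

puts flaky_call
```

Without the `if attempts < 3` guard, a body that always fails would be retried indefinitely.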
Once nb_retries goes past the end of the sleep_times array, the lookup returns nil and the if condition is falsy. The bare raise then re-raises the original exception, as is.
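The same pattern can be wrapped in a method so that it is reusable around any block. This is only a sketch under assumptions: with_retries, its on: and sleeper: parameters, and the usage values are hypothetical names, not something from the original script.

```ruby
# Hypothetical helper: wraps the begin/rescue/retry pattern around a block.
# `on` is the exception class to retry on; `sleeper` is injectable so the
# waiting can be replaced (for example in tests).
def with_retries(sleep_times:, on: StandardError, sleeper: ->(t) { sleep t })
  nb_retries = 0
  begin
    yield nb_retries
  rescue on
    # Array#[] returns nil past the end of the array, which ends the retrying.
    time = sleep_times[nb_retries]
    raise unless time # re-raise the original exception, as is
    sleeper.call(time)
    nb_retries += 1
    retry
  end
end

# Usage: the block fails twice with a made-up error, then returns a value.
calls = 0
value = with_retries(sleep_times: [0.01, 0.02]) do
  calls += 1
  fail IOError, "glitch" if calls < 3
  :ok
end
p value
```

Passing the exception class explicitly keeps the retry limited to errors you expect to be transient; anything else propagates immediately.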
Here is the output of this “script”:
ruby ~/tmp/retry.rb
retry 0 after 0.1s
retry 1 after 0.2s
retry 2 after 0.5s
retry 3 after 1s
/Users/jlecour/tmp/retry.rb:6:in `<main>': counter=5 (WhateverException)
Remember that in Ruby raise and fail are exactly the same method, but as Jim Weirich put it:
Because I use exceptions to indicate failures, I almost always use the “fail” keyword rather than the “raise” keyword in Ruby. Fail and raise are synonyms so there is no difference except that “fail” more clearly communicates that the method has failed. The only time I use “raise” is when I am catching an exception and re-raising it, because here I’m not failing, but explicitly and purposefully raising an exception.
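As a short illustration of that convention, here is a sketch with two made-up methods, fetch_value and fetch_and_log: the first uses fail to signal a failure, the second uses a bare raise to re-raise what it has just rescued.

```ruby
class WhateverException < StandardError; end

# Hypothetical method: `fail` signals that the method itself has failed.
def fetch_value(ok)
  fail WhateverException, "could not fetch" unless ok
  42
end

# Hypothetical method: `raise` is kept for re-raising a rescued exception.
def fetch_and_log(ok, log)
  fetch_value(ok)
rescue WhateverException => e
  log << "failed: #{e.message}"
  raise
end

log = []
begin
  fetch_and_log(false, log)
rescue WhateverException => e
  puts "re-raised: #{e.message}"
end
```

This is the same split used in the retry script: fail in the begin body, and a bare raise once the sleep times are exhausted.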