HTTP Requests Retry Strategy

Moath Obeidat
4 min read · Oct 5, 2023

Requests are vulnerable to failure regardless of how highly available both the source and destination hosts are.

Integrations between two systems are one of the cases where the risk of failed requests is high.

When to retry a request?

A retry strategy is used to reduce the number of failed requests caused by network issues, server errors, or exceeding the rate limit of allowed requests per period of time.

Not every failed request is worth retrying. If the status code is (500 Internal Server Error), the server has usually hit a problem of its own that is unlikely to be resolved within a few minutes, so retrying right away rarely helps.

Receiving status code (422 Unprocessable Entity) usually means that your request can’t be processed because of a validation error, so even if you retry hundreds of times the result will be the same!

So we should retry only on specific failure cases, to avoid overloading the server with thousands of useless requests.

It’s recommended to retry failed requests only for specific status codes.

Take Emailable, a tool for validating emails, as an example.

Assume that we have an application that we want to integrate with Emailable. As mentioned above, we should retry only for specific status codes, so let’s specify which status codes we will retry on if a request to Emailable fails:

(429 Too Many Requests),
(503 Service Unavailable),
(249 Try Again)

These status codes may differ depending on the third party you integrate with. The third party might use a different status code for “Try Again”, for example, or might not enforce a request rate limit at all.
So you have to decide which status codes to retry on based on your own case.
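
As a minimal sketch of this decision, the check can be a simple lookup in a set of status codes. The names `RETRYABLE_STATUS_CODES` and `should_retry` are made up for illustration, and the set holds the three codes listed above; replace them with whatever your third party documents.

```python
# Status codes worth retrying for this integration (the three listed above).
# Adjust this set for the third party you integrate with.
RETRYABLE_STATUS_CODES = frozenset({429, 503, 249})


def should_retry(status_code: int) -> bool:
    """Return True if a response with this status code is worth retrying."""
    return status_code in RETRYABLE_STATUS_CODES


print(should_retry(503))  # True
print(should_retry(422))  # False: a validation error will fail the same way again
```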

How Much Time To Wait Between Retries?

Once we have specified which status codes to retry on, it’s time to set the delay between retries.

When implementing your retry strategy, you have to add a delay between attempts to avoid causing even more failed requests, and that delay must be chosen carefully.

Choosing the delay is a bit delicate and depends on the requirements of the application you are building.

For example, if the request content is time-sensitive, so that a long delay before retrying could leave the payload outdated, you should keep the delays between attempts short: retry after 200 ms, then after 400 ms, 800 ms, 1600 ms, and so on.
Keeping the delays short prevents the payload from becoming outdated, which matters because of the nature of the application and its business requirements.

Now take an example where time is not a critical factor for the request.
In this case the retry delays can be longer: for example, retry after 1s, 2s, 4s, 8s, and so on.

As you may have noticed, in both examples the delay grows with every attempt. When the delay is multiplied by a constant factor each time (doubled, for instance), this is a well-known technique called “Exponential Backoff”, which in simple words is:
“A technique that increases the waiting time between retries exponentially.”
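
A rough sketch of such a schedule, assuming the delay doubles on every attempt, could look like the following. The function name `backoff_delay` and its parameters `base_delay` and `factor` are made up for illustration; the two starting delays (200 ms and 1 s) mirror the time-critical and non-critical cases discussed above.

```python
def backoff_delay(attempt: int, base_delay: float, factor: float = 2.0) -> float:
    """Delay in seconds before retry number `attempt` (1 = first retry)."""
    return base_delay * factor ** (attempt - 1)


# Time-critical request: start at 200 ms.
print([backoff_delay(n, 0.2) for n in range(1, 5)])  # [0.2, 0.4, 0.8, 1.6]
# Less time-critical request: start at 1 s.
print([backoff_delay(n, 1.0) for n in range(1, 5)])  # [1.0, 2.0, 4.0, 8.0]
```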

This technique reduces the rate of requests and thereby helps protect both the source and destination hosts against increased load.

If you choose to use “Exponential Backoff”, you should also set a maximum delay so that the waiting time does not grow too long.

For example, a 20-second delay might not be acceptable, so choose a maximum delay between retries that suits your application requirements.
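
One simple way to apply such a cap, sketched here under the assumption of a doubling delay, is to take the minimum of the computed delay and the maximum you allow. The name `capped_backoff_delay` and the default values (1 second base, 10 seconds maximum) are illustrative only.

```python
def capped_backoff_delay(
    attempt: int, base_delay: float = 1.0, max_delay: float = 10.0
) -> float:
    """Exponential delay in seconds for retry `attempt` (1 = first retry),
    never larger than `max_delay`."""
    return min(max_delay, base_delay * 2 ** (attempt - 1))


# The delay doubles until it hits the 10-second cap, then stays there.
print([capped_backoff_delay(n) for n in range(1, 7)])
# [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
```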

How Many Times To Retry Before Stopping?

There is no perfect number of retries to attempt before stopping; what matters most is to choose a number based on your application requirements.

If you do not set a maximum number of retries, the cost will be very high: you will end up flooding your server with thousands of requests and losing your system’s availability.

So what’s next after the retries stop?
Once the maximum number of retries is reached, you might log the failed requests and implement a cron job that collects them and retries them later, or use another way to deliver the data. Keep in mind that the data of some failed requests might become invalid after all this time and no longer be worth retrying.
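
Putting the pieces together, a bounded retry loop might look roughly like the sketch below. Everything in it is an assumption made for illustration: `send` is a hypothetical callable that performs the request and returns the HTTP status code, `failed_log` is a plain list standing in for wherever you record failed requests for a later job, and `sleep` is injectable so the example runs without actually waiting.

```python
import time
from typing import Callable, Optional

RETRYABLE_STATUS_CODES = {429, 503, 249}


def send_with_retries(
    send: Callable[[], int],
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    failed_log: Optional[list] = None,
) -> int:
    """Call `send` until it returns a non-retryable status or retries run out.

    Returns the last status code seen.
    """
    status = send()
    for attempt in range(1, max_retries + 1):
        if status not in RETRYABLE_STATUS_CODES:
            return status
        # Capped exponential backoff before the next attempt.
        sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
        status = send()
    if status in RETRYABLE_STATUS_CODES and failed_log is not None:
        # Retries exhausted: record the failure so a later job can pick it up.
        failed_log.append(status)
    return status


# Usage with a fake sender that fails twice and then succeeds.
responses = iter([503, 429, 200])
delays = []
status = send_with_retries(lambda: next(responses), sleep=delays.append)
print(status, delays)  # 200 [1.0, 2.0]
```

The loop makes one initial attempt plus at most `max_retries` retries, so the total number of requests is always bounded.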

Conclusion

Building a solid retry strategy requires taking several points into consideration:

  • Choose the right status codes to retry on.
  • Choose the right technique to set the delay between retries.
  • Set a maximum delay time between retries.
  • Set a maximum number of retries.

In addition to all these points, the application nature and requirements must also be taken into consideration.
