Spring Apache Http Client Socket Read Timeout

March 02, 2022

Explaining with pictures what connection timeout, read timeout and connection pool timeout are, and how Apache HTTP Client compares to Asynchronous HTTP Client when handling them

I recently had to introduce a colleague to the wonderful and exciting world of timeouts in Apache HttpClient. As the usual explanation that "the connection timeout is the maximum time to establish a connection to the server" is not the most descriptive one, let's try to explain with a couple of pictures what each timeout actually means.

Even if we will be talking about Apache's HttpClient, the following explanation is useful for any TCP-based communication, which includes most JDBC drivers.

As a reference, here are all the timeouts that you must configure if you want a healthy production service:

Connection Timeout
Read Timeout

If you are using microservices, you will also need to configure a connection pool and the following timeouts:

Connection Pool Timeout
Connection Pool Time To Live (TTL)

You will find here how to configure these timeouts in Java. In our examples we will use clj-http, which is a simple wrapper over Apache's HttpClient. We will also compare how timeouts work in Asynchronous HTTP Client.

All the code, including a Docker Compose environment to test the settings, can be found at https://github.com/dlebrero/apache-httpclient-timeouts.

Connection timeout

Before your HTTP client can start exchanging information with the server, a communication path (or road or pipe) between the two must be established.

This is done with a handshake:

phone handshake

After this exchange you and your partner can start a conversation, that is, exchange data.

In TCP terms, this is called the three-way handshake:

TCP handshake

The connection timeout controls how long you are willing to wait for this handshake to complete.

Let's test it using a non-routable IP address:

    ;; Assumes clj-http.client is aliased as client and a logging namespace
    ;; (for example clojure.tools.logging) is aliased as log.

    ;; Without connection timeout
    (time
      (try
        (client/get "http://10.255.255.1:22220/")
        (catch Exception e)))
    "Elapsed time: 75194.7148 msecs"

    ;; With connection timeout
    (time
      (try
        (client/get "http://10.255.255.1:22220/"
          {:connection-timeout 2000})
        (catch Exception e
          (log/info (.getClass e) ":" (.getMessage e)))))
    "Elapsed time: 2021.1883 msecs"
    INFO  a.http-client - java.net.SocketTimeoutException : connect timed out

Notice the different elapsed time, the exception printed and the message inside the exception.
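For readers who prefer to see the same idea without clj-http, here is a minimal sketch in plain JDK Java. It is not how Apache HttpClient is implemented; it only shows the timeout that bounds the TCP handshake. The class and method names are made up for this example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ConnectTimeoutDemo {

    /** Tries to open a TCP connection, waiting at most timeoutMillis for the handshake. */
    static String tryConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // The timeout passed here only bounds the three-way handshake.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "connected";
        } catch (SocketTimeoutException e) {
            return "connect timed out";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    /** Opens a listener on a free local port and connects to it. */
    static String connectToLocalListener(int timeoutMillis) {
        try (ServerSocket listener = new ServerSocket(0)) {
            return tryConnect("127.0.0.1", listener.getLocalPort(), timeoutMillis);
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(connectToLocalListener(2000));
        // Same non-routable address as in the clj-http example. What happens
        // here depends on the network: usually the handshake is never answered.
        System.out.println(tryConnect("10.255.255.1", 22220, 2000));
    }
}
```

On a network where packets to the non-routable address are silently dropped, the second call should give up after about two seconds, like the clj-http call with :connection-timeout 2000.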

Read timeout

Once the connection is established, and you are happily talking with the server, you can specify how long you are willing to wait to hear back from the server, using the read timeout:

Apache Http Client read timeout

Let'south test it, using this fourth dimension effectually an Nginx server, with a Toxiproxy in the middle to mess around with the response times:

                      
;; With no socket timeout (time   (try     (client/get "http://local.toxiproxy:22220/")     (catch Exception e (.printStackTrace e)))) "Elapsed time: 240146.6273 msecs"  ;; Same url, with socket timeout (time     (try       (client/get "http://local.toxiproxy:22220/"         {:socket-timeout 2000})       (catch Exception e         (log/info (.getClass e) ":" (.getMessage due east))))) "Elapsed time: 2017.7835 msecs" INFO  a.http-client - java.internet.SocketTimeoutException : Read timed out

Note that the default socket timeout is arrangement dependant. Notice the different elapsed fourth dimension, the exception printed and the message within the exception.

The ToxiProxy configuration can exist institute hither.
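The read timeout can also be reproduced without Nginx or Toxiproxy. The sketch below is plain JDK Java with made-up names: it starts a local server that accepts the connection and then stays silent, so the client read can only end by timing out. It assumes that SO_TIMEOUT on a plain socket is a fair stand-in for the :socket-timeout option used above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends a byte.
     * Returns the milliseconds spent blocked in read() before the timeout
     * fired, or -1 if the read returned without timing out.
     */
    static long millisUntilReadTimeout(int soTimeoutMillis) {
        try (ServerSocket silentServer = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", silentServer.getLocalPort());
             Socket accepted = silentServer.accept()) {
            // Maximum time a single read() may block waiting for data.
            client.setSoTimeout(soTimeoutMillis);
            long start = System.nanoTime();
            try {
                client.getInputStream().read();
                return -1;
            } catch (SocketTimeoutException e) {
                return (System.nanoTime() - start) / 1_000_000;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Read gave up after ~" + millisUntilReadTimeout(2000) + " ms");
    }
}
```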

Pub quiz

With these two timeouts, you should easily score one point for your team in your next IT Pub Quiz Championship:

If you configure your HTTP client with a 10 second connection timeout and a 1 second read timeout, how long is a thread going to get stuck after issuing an HTTP request, in the worst case scenario?

You guessed it right! Infinite! One point for your team!

Whoot? You did not answer infinite? It is soooo obvious (sarcasm).

Let's again call one of your friends and ask him about Pi, but this time we are going to call one of those high precision smartass friends:

Apache http client read timeout retry

What is going on?

If you read carefully the previous explanation about the read timeout or, even better, the javadoc about it, you will notice that the read timeout is reset each time we hear from the server. So if the response is too big, the connection is too slow, the server is choking, or anything between the client and the server is having trouble, your client thread will be there hanging for a very long time.

Let's see it in action. First we configure Toxiproxy to be very, very slow while proxying the Nginx response (~ 2 bytes per second):

    (client/post "http://local.toxiproxy:8474/proxies/proxied.nginx/toxics"
      {:form-params {:attributes {:delay 1000000
                                  :size_variation 1
                                  :average_size 2}
                     :toxicity 1.0
                     :stream "downstream"
                     :type "slicer"}
       :content-type :json})

And now we make exactly the same request as before, with a 2 second timeout:

    (time
      (try
        (client/get "http://local.toxiproxy:22220/"
          {:socket-timeout 2000})
        (catch Exception e
          (log/info (.getClass e) ":" (.getMessage e)))))
    "Elapsed time: 310611.8366 msecs"

That is more than five minutes! And thankfully it is just 600 bytes.

Here is what the HttpClient logs look like, just for reading the first bytes of the header:

Apache http client slow logs

That looks pretty slow. Of course, this will never ever happen to you (more sarcasm here).

We will see at the bottom how to avoid this issue.
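The Pub Quiz answer is easy to reproduce locally. In this sketch (plain JDK Java, invented names, and a local dripping server standing in for the Toxiproxy slicer) the server sends one byte at a time with a pause between bytes. Every pause is shorter than the read timeout, so the timeout never fires even though the whole read takes longer than the timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

class SlowDripDemo {

    /** Reads from a server that drips one byte every gapMillis. Returns {bytesRead, elapsedMillis}. */
    static long[] readFromDrippingServer(int bytes, int gapMillis, int soTimeoutMillis) {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread dripper = new Thread(() -> {
                try (Socket peer = server.accept();
                     OutputStream out = peer.getOutputStream()) {
                    for (int i = 0; i < bytes; i++) {
                        out.write('x');
                        out.flush();
                        Thread.sleep(gapMillis); // always shorter than the read timeout
                    }
                } catch (IOException | InterruptedException e) {
                    // Demo server: nothing useful to do on failure.
                }
            });
            dripper.setDaemon(true);
            dripper.start();

            try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMillis);
                InputStream in = client.getInputStream();
                long start = System.nanoTime();
                long count = 0;
                // Each read() gets a fresh timeout budget, so the loop keeps going.
                while (in.read() != -1) {
                    count++;
                }
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
                return new long[] {count, elapsedMillis};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = readFromDrippingServer(20, 50, 500);
        System.out.println(result[0] + " bytes in " + result[1] + " ms with a 500 ms read timeout");
    }
}
```

With 20 bytes, 50 ms gaps and a 500 ms read timeout, the read should complete after roughly one second, about twice the timeout, without ever throwing.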

Connection Pool

Before talking about what the connection pool timeout is, let's see what the point of having a connection pool is, with an example.

Let's say that there are two Stock Market traders with a special interest in Mordor Stocks (Symbol: M$). Both are watching the same news channel, but one is using a connection pool (the one on the right) while the other is not:

Why use a http client connection pool

As you can see, the trader with the connection pool leaves the phone off the hook and the broker waiting for more orders.

When, quite unexpectedly, a one metre humanoid manages to travel 2900 km across several war zones and inhospitable areas, and deliver the only existing nuke to the only existing weak spot of Sauron, the trader can very quickly sell all of his Mordor Stocks, while the trader without the connection pool is doomed.

So if you are going to call the same server a lot, which is typical for microservices architectures, you will want to avoid the overhead of creating new connections to the server, as it can be quite an expensive operation (from a few milliseconds to hundreds of milliseconds).

This is especially true if you are using HTTPS. See the TLS handshake.

Connection pool timeout and TTL

As much as connection pools are awesome, as with any other resource, you need to limit the maximum number of open connections that you want to maintain, which means that there are three possible scenarios when fetching a connection from the pool.

Side note: for a very good talk about how to size your connection pool see "Stop Rate Limiting! Capacity Management Done Right" by Jon Moore.

Scenario 1. Free connections.

Assuming a max connection pool of three, the first scenario is:

HTTP Connection pool new connection

So there is some phone available but on the hook. You will need to suffer the extra connection setup delay.

Scenario 2. Connection pooled.

The second scenario:

HTTP Connection pool connection available

There is a phone off the hook, ready to be used. In this scenario, there are another two cases:

The connection is fresh, created less than the configured TTL ago. You will NOT need to suffer the extra connection setup delay.
The connection is stale, created more than the configured TTL ago. You will need to suffer the extra connection setup delay.

Let's test it:

    ;; conn-manager is assumed to alias clj-http.conn-mgr.
    ;; Create a new connection pool, with a TTL of 1 second:
    (def cp (conn-manager/make-reusable-conn-manager
              {:timeout 1 ; in seconds. This is called TimeToLive in PoolingHttpClientConnectionManager
               }))

    ;; Make 10 calls, two per second:
    (dotimes [_ 10]
      (log/info "Send Http request")
      (client/get "http://local.nginx/" {:connection-manager cp})
      (Thread/sleep 500))

Looking at the logs:

    16:56:24.905 INFO  - Send Http request
    16:56:24.914 DEBUG - Connection established 172.24.0.4:51984<->172.24.0.2:80
    16:56:25.416 INFO  - Send Http request
    16:56:25.926 INFO  - Send Http request
    16:56:25.933 DEBUG - Connection established 172.24.0.4:51986<->172.24.0.2:80
    16:56:26.434 INFO  - Send Http request
    16:56:26.942 INFO  - Send Http request
    16:56:26.950 DEBUG - Connection established 172.24.0.4:51988<->172.24.0.2:80
    16:56:27.452 INFO  - Send Http request
    16:56:27.960 INFO  - Send Http request
    16:56:27.967 DEBUG - Connection established 172.24.0.4:51990<->172.24.0.2:80
    16:56:28.468 INFO  - Send Http request

As expected, we can make two requests before recreating the connection.

Same scenario but with a 20 second TTL:

    16:59:19.562 INFO  - Send Http request
    16:59:19.570 DEBUG - Connection established 172.24.0.4:51998<->172.24.0.2:80
    16:59:20.073 INFO  - Send Http request
    16:59:20.580 INFO  - Send Http request
    16:59:21.086 INFO  - Send Http request
    16:59:21.593 INFO  - Send Http request
    16:59:22.100 INFO  - Send Http request
    16:59:22.607 INFO  - Send Http request
    16:59:23.114 INFO  - Send Http request
    16:59:23.623 INFO  - Send Http request
    16:59:24.134 INFO  - Send Http request

So the same connection is used for all requests.
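The pattern in both logs can be explained with a toy model. The sketch below is not Apache's pooling code; it is a hypothetical simulation in plain Java that assumes a pooled connection is reused only while its age is below the TTL. The names are invented, and the requests are assumed to be about 510 ms apart (the 500 ms sleep plus the few milliseconds each request takes in the logs).

```java
class TtlModel {

    /**
     * Simulates sequential requests sent every gapMillis over a single pooled
     * connection, and counts how many connections have to be opened.
     */
    static int connectionsOpened(int requests, long gapMillis, long ttlMillis) {
        int opened = 0;
        long createdAt = 0;
        boolean hasConnection = false;
        for (int i = 0; i < requests; i++) {
            long now = i * gapMillis;
            // A stale connection (age >= TTL) is discarded and replaced.
            if (!hasConnection || now - createdAt >= ttlMillis) {
                opened++;
                createdAt = now;
                hasConnection = true;
            }
        }
        return opened;
    }

    public static void main(String[] args) {
        System.out.println("TTL 1 s:  " + connectionsOpened(10, 510, 1_000) + " connections");
        System.out.println("TTL 20 s: " + connectionsOpened(10, 510, 20_000) + " connections");
    }
}
```

Under those assumptions the model opens a new connection every second request with the one second TTL, and a single connection with the 20 second TTL, which matches what the logs show.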

But why do we need the TTL? Mostly because firewalls have a tendency to drop long-lived connections (especially idle ones) without telling any of the involved parties, which causes the client to take a while to realize that the connection is no longer usable.

Scenario 3. All connections in use.

The last scenario:

HTTP Connection pool full

All the phones are busy, so you will have to wait. How long you are willing to wait for a phone to become free is the connection pool timeout.

Note that if a phone becomes available before the connection pool timeout, you are back to the second scenario. With some unlucky timing, you will also need to establish a fresh connection.

Let's look at an example. First we make the Nginx very slow, taking up to 20 seconds to respond.

Then we create a connection pool with a maximum of 3 connections and we send four HTTP requests:

    (def cp-3 (conn-manager/make-reusable-conn-manager
                {:timeout 100
                 :threads 3           ;; Max connections in the pool.
                 :default-per-route 3 ;; Max connections per route (~ max connections to a server)
                 }))

    (dotimes [_ 4]
      (future
        (time
          (client/get "http://local.toxiproxy:22220/" {:connection-manager cp-3}))))

    "Elapsed time: 20017.1325 msecs"
    "Elapsed time: 20016.9246 msecs"
    "Elapsed time: 20020.9474 msecs"
    "Elapsed time: 40024.5604 msecs"

As you can see, the last request takes 40 seconds, 20 of which are spent waiting for a connection to be available.

Adding a one second connection pool timeout:

    (dotimes [_ 4]
      (future
        (time
          (try
            (client/get "http://local.toxiproxy:22220/"
              {:connection-manager cp-3
               :connection-request-timeout 1000 ;; Connection pool timeout in millis
               })
            (catch Exception e
              (log/info (.getClass e) ":" (.getMessage e)))))))

    "Elapsed time: 1012.2696 msecs"
    "2019-12-08 08:59:04.073 INFO  - org.apache.http.conn.ConnectionPoolTimeoutException : Timeout waiting for connection from pool"
    "Elapsed time: 20014.1366 msecs"
    "Elapsed time: 20015.3828 msecs"
    "Elapsed time: 20015.962 msecs"

The thread that is not able to get a connection from the pool gives up after one second, throwing a ConnectionPoolTimeoutException.
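If it helps to think about it in code, waiting for a phone to become free behaves like a timed acquire on a semaphore. This is only a model with invented names, written in plain Java; it is not how PoolingHttpClientConnectionManager is implemented, and it hands out permits instead of real connections.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class PoolTimeoutModel {

    private final Semaphore freeSlots;

    PoolTimeoutModel(int maxConnections) {
        this.freeSlots = new Semaphore(maxConnections, true);
    }

    /** Waits up to timeoutMillis for a free slot. Returns false when the pool timeout expires. */
    boolean lease(long timeoutMillis) {
        try {
            return freeSlots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a slot to the pool, like hanging up a phone. */
    void release() {
        freeSlots.release();
    }

    public static void main(String[] args) {
        PoolTimeoutModel pool = new PoolTimeoutModel(3);
        pool.lease(1000);
        pool.lease(1000);
        pool.lease(1000);
        // All three slots are taken, so this call waits one second and gives up.
        System.out.println("Fourth lease succeeded? " + pool.lease(1000));
    }
}
```

In the real client the failed lease surfaces as the ConnectionPoolTimeoutException shown above rather than as a boolean.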

Are we done yet?

Unfortunately, even if connection timeout, read timeout, connection pool timeout and connection pool TTL are the most common things to tweak, you should also be aware of:

DNS resolution: it cannot be explicitly configured in Java; it is system dependent. It is also good to know how it is cached.
Hosts with multiple IPs: in case of a connection timeout, the HTTP client will try each of them.
TIME_WAIT and SO_LINGER: closing a connection is not immediate, and under very high load it can cause issues.

All together!

Putting all the timeouts together, we have:

All Apache HTTP client timeouts

With all these timeouts, it is quite a challenge to know how long an HTTP request is actually going to take, so if you have any SLA or are worried about the stability of your application, you cannot rely solely on setting the timeouts correctly.

If you want to set up just one simple timeout for the whole request, you should be using Hystrix Thread Isolation, Apache HTTP Client's FutureRequestExecutionService (never used this one myself) or maybe a different HTTP client.
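The idea behind thread isolation can be sketched with nothing more than the JDK. The example below is a simplified illustration with invented names, not Hystrix and not FutureRequestExecutionService: the call runs on a worker thread and the caller waits for it only up to a deadline.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class DeadlineDemo {

    /** Runs the call on a worker thread; returns its result, or fallback once the deadline passes. */
    static String callWithDeadline(Callable<String> call, long deadlineMillis, String fallback) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<String> future = worker.submit(call);
        try {
            return future.get(deadlineMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Frees the caller. The worker is interrupted, but a thread blocked
            // in a socket read may not react to the interrupt.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String result = callWithDeadline(() -> {
            Thread.sleep(5_000); // stands in for a slow HTTP call
            return "response";
        }, 1_000, "gave up");
        System.out.println(result);
    }
}
```

A real service would reuse a shared, bounded executor instead of creating one per call. Note the trade-off in the comment: the caller gets its answer on time, but the stuck worker thread may stay busy until the underlying read returns.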

Asynchronous HTTP Client

A possible solution to all this timeout juggling is to use Asynchronous HTTP Client, which is based on Netty. You can see here all the above scenarios but using the Asynchronous HTTP Client.

Some notable differences between both HTTP clients:

Asynchronous HTTP clients have their own thread pool to handle the response once it arrives.
No connection pool timeout: if the pool is completely used, an exception is thrown. There is no waiting for a connection to be available. Interestingly, I usually configure my Apache HTTP connection pools to behave the same, as a full connection pool usually means that something is not working and it is better to bail out early.
Connection pool idle timeout: as we mentioned before, we wanted a connection pool TTL mostly because of idle connections. Asynchronous HTTP Client comes with an explicit idle timeout, on top of a TTL timeout.
A new request timeout: a timeout to bound the amount of time it takes to do the DNS lookup, the connection and read the whole response. One single timeout that states how long you are willing to wait for the whole HTTP conversation to be done. Sweet.

So the timeouts for the Asynchronous HTTP client look like:

Asynchronous HTTP client timeouts

You can see again all the same scenarios but using this new request timeout here, including the Pub Quiz one.

Reasoning about the worst case is a lot easier.

Summary

In summary, timeouts are annoyingly difficult to configure if you want to have some control over the maximum time allocated for an HTTP request/response, unless you are using an Asynchronous HTTP Client (or probably other async clients).

Am I suggesting that you should not use Apache HTTP Client?

Well, it depends on what functionality you are using. Apache HTTP Client is a very mature project with plenty of built-in functionality and hooks to customize it. It even has an async module, and the newer 5.0 (beta) version comes with built-in async functionality.

In our case, after this long explanation to my colleague, given our use cases, moving to Asynchronous HTTP Client was my suggestion.


Source: https://danlebrero.com/2019/12/11/apache-http-client-timeouts-config-production-asynchronous-http-client-pictures/
