Handling robots that lack websocket support and throttle timeouts

Added by Bruce Toll over 2 years ago

We have an application that is configured to use websockets. It receives connections via an nginx server acting as a reverse proxy. The nginx server has a 60 second timeout on proxy operations.

On occasion, nginx logs upstream timeouts for a proxy connection to our Wt app. These often occur on sessions that appear to originate from a Chrome browser with a reported resolution of 1366x768. The affected sessions do not successfully upgrade to a websocket connection, but instead maintain an outstanding POST request (for long-polling). While these POST requests would normally get canceled/retried by the client after 50 seconds, these sessions seem to have an additional issue that throttles JavaScript timers. As a result, the Wt client does not close the connection before the nginx 60 second timeout expires, so nginx logs an error and replies to the POST request with a 504 status "upstream timed out".

One example where we see this behavior involves Gmail. It seems that when a link to a Wt app is opened in Gmail, the Wt app may get visited some time later by a robot "browser" with these characteristics.

I did some testing with nginx configured for a longer proxy timeout setting (20 minutes) to observe what would happen with this specific robot/browser. It seems that the first long poll lasts around 90 seconds before getting retried by the Wt client. A second long poll appears to get disconnected by the browser-like client after 180 seconds; nginx logs a 499 status.

With either proxy timeout setting, the Wt session gets successfully cleaned up after the session timeout expires (10 minutes), so this behavior seems mostly harmless. However, I'm concerned that ignoring upstream timeouts might mask a real issue with the upstream Wt application, where it might actually take more than a minute to respond to a user request.

This topic is somewhat related to #8136.

It might be helpful if Wt maintained a server-side watchdog timer to deal with defective client timer support. This could help avoid proxy timeouts, while providing a more meaningful error message in the Wt log. If these robot/browsers become more prevalent, it might also be desirable to identify and terminate these sessions more quickly to conserve resources.

Although I haven't done any testing, it may be possible to accomplish some of these goals without additional Wt support:

  1. Set a singleShot WTimer for 10 seconds at application start-up.
  2. Set a server-side ASIO steady_timer as a watchdog timer with a longer timeout, e.g. 20 seconds, at application start-up.
  3. If the watchdog timer expires first, cancel the WTimer and log that the client's timer support seems broken. Optionally, quit() the session (if unexpected, e.g. not a mobile browser).
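As a rough, untested illustration of the watchdog idea in the steps above, here is a minimal sketch that uses only the C++ standard library. It is not Wt or ASIO code: a `std::thread` waiting on a `std::condition_variable` stands in for the ASIO steady_timer, and the class name `ClientTimerWatchdog` and its methods are hypothetical. The assumption is that the handler for the client-side WTimer would call `clientTimerFired()`, and that `onExpired` would do the logging and, optionally, end the session.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper (not part of Wt): a one-shot watchdog that calls
// onExpired unless clientTimerFired() is called before the deadline.
class ClientTimerWatchdog {
public:
    ClientTimerWatchdog(std::chrono::milliseconds deadline,
                        std::function<void()> onExpired)
        : onExpired_(std::move(onExpired)),
          worker_([this, deadline] { run(deadline); }) {}

    ~ClientTimerWatchdog() {
        clientTimerFired();  // cancel a pending wait so join() returns promptly
        worker_.join();
    }

    // Assumed to be called from the handler of the client-side timer (step 1).
    void clientTimerFired() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            fired_ = true;
        }
        cv_.notify_one();
    }

    // True once the deadline passed without the client timer firing.
    bool expired() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return expired_;
    }

private:
    void run(std::chrono::milliseconds deadline) {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait_for returns the predicate's value: false means the deadline
        // passed while fired_ was still false.
        const bool firedInTime =
            cv_.wait_for(lock, deadline, [this] { return fired_; });
        if (!firedInTime) {
            expired_ = true;
            lock.unlock();  // do not hold the mutex while running user code
            onExpired_();   // e.g. log that client timer support seems broken
        }
    }

    mutable std::mutex mutex_;
    std::condition_variable cv_;
    bool fired_ = false;
    bool expired_ = false;
    std::function<void()> onExpired_;
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

With the values from the steps above, the watchdog would be created with a 20 second deadline at application start-up, alongside the 10 second single-shot WTimer. Note that `onExpired` runs on the watchdog's own thread, so in a real Wt application any session access from it (such as calling quit()) would presumably need to go through `WServer::post()` or a `WApplication::UpdateLock`.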

These are just preliminary notes. Please feel free to follow up with related experiences and any additional thoughts on handling these sessions.


Replies (1)

RE: Handling robots that lack websocket support and throttle timeouts - Added by Bruce Toll over 2 years ago

NOTE: The comments in the previous message relate to Wt 4.5.0.
