HTTP 504 gateway timeout error while generating support.zip from remote Mesh nodes with AWS ALB and Bitbucket Data Center
Platform Notice: Data Center - This article applies to Atlassian products on the Data Center platform.
Note that this knowledge base article was created for the Data Center version of the product. Data Center knowledge base articles for non-Data Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
While following the instructions on the page How to generate and gather support zips from remote mesh nodes, the HTTP connection fails with an "HTTP 504" gateway timeout error and support.zip is not downloaded. The remote Mesh node runs in an AWS cluster behind an AWS ALB load balancer.
Environment
The solution has been validated in Bitbucket 8.19.2 but may be applicable to other versions.
This article was written for Bitbucket running with remote Mesh nodes behind an AWS ALB load balancer, but the same approach is applicable to other load balancers and reverse proxies.
Diagnosis
To collect support.zip from remote Mesh nodes, you have to make a REST API call to them. The easiest way to do this is with a command-line tool such as curl or wget. When these commands are run with the --verbose argument, they output HTTP connection status information.
When there is a load balancer in front of the Mesh nodes, the HTTP connection between the curl or wget command-line client and the remote Mesh node sometimes fails with an HTTP 504 error, and support.zip is not downloaded.
As an additional test, check how long it takes for the connection to be dropped with the "HTTP 504" error:
- A command sequence like the following can help; COMMAND_YOU_USED_TO_GENERATE_SUPPORT_ZIPS below denotes the actual curl or wget command with all its arguments:
  date; COMMAND_YOU_USED_TO_GENERATE_SUPPORT_ZIPS; date
- Note the time elapsed between the start and the end of the command.
  If it is close to a round number of seconds (30, 60, 180, 300, 600, etc.), this strongly suggests that a timeout is involved.
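The timing check above can also be scripted. The following is a minimal sketch, not part of the original procedure: the function names closest_timeout and time_command are hypothetical, the list of round values is the one quoted above, and the 3-second tolerance is an assumption you may want to adjust.

```shell
#!/usr/bin/env bash

# Print the round timeout value (in seconds) that the elapsed time is close
# to, or print nothing and return 1 if there is no match.
# $1 = elapsed seconds, $2 = tolerance in seconds (assumed default: 3).
closest_timeout() {
  local elapsed=$1 tolerance=${2:-3} candidate diff
  for candidate in 30 60 180 300 600; do
    diff=$(( elapsed - candidate ))
    # ${diff#-} strips a leading minus sign, giving the absolute value.
    if [ "${diff#-}" -le "$tolerance" ]; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

# Run the given command, report how long it took, and flag a likely timeout.
# The exit status of the command itself is not preserved in this sketch.
time_command() {
  local start end elapsed match
  start=$(date +%s)
  "$@"
  end=$(date +%s)
  elapsed=$(( end - start ))
  echo "Elapsed: ${elapsed}s"
  if match=$(closest_timeout "$elapsed"); then
    echo "Close to ${match}s: a timeout of that length may be involved"
  fi
}

# Usage (replace the placeholder with your actual curl or wget command):
#   time_command COMMAND_YOU_USED_TO_GENERATE_SUPPORT_ZIPS
```

An elapsed time that matches one of the listed values is only a hint; compare it with the idle timeout actually configured on your load balancer before drawing conclusions.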
Cause
An HTTP 504 gateway timeout error occurs with a reverse proxy or load balancer in front of the remote Mesh nodes because of how long it takes to generate support.zip: generation can take several minutes, and during that time no data is transferred between the remote Mesh node and the load balancer.
If the load balancer is configured with an "idle disconnect timeout" parameter, it drops the connection once that period passes without any data being transferred between the remote Mesh node and itself.
Solution
To solve this problem, reconfigure the load balancer with a longer idle disconnect timeout.
Different load balancers use different parameter names and default settings.
The following information covers the idle connection timeout settings of the AWS ALB load balancer:
- The page Edit attributes for your Application Load Balancer - Connection idle timeout describes how to change the "connection idle" timeout.
- The page Application Load Balancers - Load balancer attributes has a list of configurable attributes.
  The attribute responsible for closing idle connections is idle_timeout.timeout_seconds: the idle timeout value, in seconds. The default is 60 seconds.
Since support.zip generation can take a significant amount of time, we suggest setting the idle disconnect timeout to a larger value; you may need to tune it over a few iterations.
Starting with 10 or 15 minutes is a reasonable first step.
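If you prefer the command line to the AWS console, the attribute can also be changed with the AWS CLI. The following is a sketch under stated assumptions: it assumes the AWS CLI is installed and configured with permission to modify the load balancer, the function names idle_timeout_attribute and set_alb_idle_timeout are hypothetical, and the load balancer ARN is a placeholder you must supply. The 1 to 4000 second range check reflects the limits listed in the AWS documentation for idle_timeout.timeout_seconds; verify it against the current documentation.

```shell
#!/usr/bin/env bash

# Build the ALB attribute string for an idle timeout given in minutes.
# Rejects values outside the 1-4000 second range documented by AWS.
idle_timeout_attribute() {
  local minutes=$1 seconds
  seconds=$(( minutes * 60 ))
  if [ "$seconds" -lt 1 ] || [ "$seconds" -gt 4000 ]; then
    echo "idle timeout must be between 1 and 4000 seconds" >&2
    return 1
  fi
  echo "Key=idle_timeout.timeout_seconds,Value=${seconds}"
}

# Apply the timeout to a load balancer.
# $1 = load balancer ARN (placeholder, supply your own), $2 = minutes.
set_alb_idle_timeout() {
  local lb_arn=$1 minutes=$2 attribute
  attribute=$(idle_timeout_attribute "$minutes") || return 1
  aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn "$lb_arn" \
    --attributes "$attribute"
}

# Usage (15 minutes = 900 seconds):
#   set_alb_idle_timeout "YOUR_LOAD_BALANCER_ARN" 15
```

After changing the value, repeat the support.zip request and increase the timeout further if the HTTP 504 error still occurs.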