Bamboo Data Center NodeAliveWatchdog shuts down Bamboo during scheduled DB backups
Platform Notice: Data Center Only - This article only applies to Atlassian products on the Data Center platform.
Note that this KB was created for the Data Center version of the product. Data Center KBs for non-Data-Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Bamboo Data Center shuts down and logs the following message in atlassian-bamboo.log:
2023-03-23 06:17:46,556 ERROR [scheduler_Worker-6] [NodeAliveWatchdog] Current node failed to refresh its state in DB within last 3 minutes. This node will now go down
Environment
Bamboo Data Center 8.0 and above
Diagnosis
The atlassian-bamboo.log file contains the following message:
2023-03-23 06:17:46,556 ERROR [scheduler_Worker-6] [NodeAliveWatchdog] Current node failed to refresh its state in DB within last 3 minutes. This node will now go down
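To check whether the shutdowns line up with your backup schedule, it can help to pull the timestamps of these messages out of the log. The following is a minimal sketch, not an Atlassian tool: the function name find_watchdog_shutdowns is hypothetical, and the pattern assumes the log line has exactly the layout quoted above, which may differ between Bamboo versions or logging configurations.

```python
import re
from datetime import datetime

# Pattern derived from the log line quoted in this article (an assumption:
# other Bamboo versions or log layouts may format the line differently).
WATCHDOG_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ERROR \[[^\]]+\] "
    r"\[NodeAliveWatchdog\] Current node failed to refresh its state in DB"
)


def find_watchdog_shutdowns(lines):
    """Return the timestamps of NodeAliveWatchdog shutdown messages.

    `lines` is any iterable of log lines, for example an open file object.
    """
    hits = []
    for line in lines:
        match = WATCHDOG_PATTERN.match(line)
        if match:
            # Milliseconds are dropped; second precision is enough to
            # compare against a backup schedule.
            hits.append(datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"))
    return hits
```

Passing an open atlassian-bamboo.log file object to find_watchdog_shutdowns returns a list of datetime values that you can compare against the start and end times of the database backup window.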
Cause
The Bamboo NodeAliveWatchdog checks that the node can read from and write to the database. If the database is unavailable or read-only for more than 3 minutes, the node shuts down so that the cold-standby node, if one is available, can take over.
Solution
If you expect your database to be unavailable for more than 3 minutes, you can increase the NodeAliveWatchdog timeout or disable the check by adding a Bamboo system property. A value of 0 disables the check; a value greater than 0 sets the timeout in minutes. For example, the snippet below sets the timeout to 5 minutes.
-Dbamboo.node.alive.watchdog.timeout=5
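To choose a value, compare the longest expected database outage with the timeout. The sketch below only models the rule described in this article (0 disables the check, otherwise the node goes down after more than the configured number of minutes). It is not Bamboo's implementation: the function name node_would_shut_down is hypothetical, and the real watchdog runs on a scheduler, so behavior near the boundary may differ.

```python
# Default taken from the "within last 3 minutes" log message in this article.
DEFAULT_TIMEOUT_MINUTES = 3


def node_would_shut_down(outage_minutes, timeout_minutes=DEFAULT_TIMEOUT_MINUTES):
    """Model whether a database outage of the given length trips the watchdog.

    timeout_minutes mirrors the value passed to
    -Dbamboo.node.alive.watchdog.timeout: 0 disables the check, and any
    positive number is a timeout in minutes.
    """
    if timeout_minutes < 0:
        raise ValueError("timeout_minutes must be 0 or greater")
    if timeout_minutes == 0:
        # The check is disabled, so no outage length causes a shutdown.
        return False
    return outage_minutes > timeout_minutes
```

Under this model, a 4-minute backup window trips the default 3-minute timeout but not a timeout of 5, and a timeout of 0 never triggers a shutdown regardless of outage length.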