Bitbucket Mesh - Sideband channel died

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

You may often see "Sideband channel died" warnings in the Bitbucket logs. Although these warnings have no impact on the application, they can cause the application logs to grow in size.

A sample log snippet:

2022-09-14 07:11:54,491 WARN  [mesh-grpc-request:thread-21] testuser *SSC569x413x5x0 10.137.142.255,10.137.40.48 "POST /scm/prj/test-repo.git/git-upload-pack HTTP/1.1" c.a.s.i.s.g.m.DefaultMeshSidebandRegistry loadbalancer.meshnode.com (https://loadbalancer.meshnode1.com:443): Sideband channel died
io.grpc.StatusRuntimeException: INTERNAL: RST_STREAM closed stream. HTTP/2 error code: INTERNAL_ERROR
...
2022-09-14 07:11:54,491 WARN  [mesh-grpc-request:thread-18] buildman *SSC569x413x8x0 10.137.40.74,10.137.40.41 "GET /scm/prj/test-repo2.git/info/refs HTTP/1.1" c.a.s.i.s.g.m.DefaultMeshSidebandRegistry loadbalancer.meshnode1.com (https://loadbalancer.meshnode1.com:443): Sideband channel died
io.grpc.StatusRuntimeException: INTERNAL: RST_STREAM closed stream. HTTP/2 error code: PROTOCOL_ERROR

Environment

Bitbucket 8.x with multiple mesh nodes, where each mesh node has a dedicated load balancer in front of it.

Diagnosis

There is no direct impact on the application. Although the sideband channel disconnects, the logs show that it reconnects on its own:

2022-09-16 07:10:14,660 INFO  [grpc-server:thread-18] 79FBEU8Sx407x7x3 10.101.46.222 "ManagementService/Sideband" (>5 <5) c.a.b.m.m.DefaultSidebandManager 10.101.46.222: Sideband disconnected
2022-09-16 07:15:14,665 INFO  [grpc-server:thread-19] 79FBEU8Sx435x62x5 10.101.46.23 "ManagementService/Sideband" (>0 <0) c.a.b.m.m.DefaultSidebandManager 10.101.46.23: Sideband connected
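To gauge how noisy these warnings are (and hence their effect on log growth), you can count occurrences with grep. A minimal sketch, run here against a fabricated sample snippet; in a real installation the log is typically $BITBUCKET_HOME/log/atlassian-bitbucket.log:

```shell
# Create a small sample log (a stand-in for atlassian-bitbucket.log).
printf '%s\n' \
  '2022-09-14 07:11:54,491 WARN  c.a.s.i.s.g.m.DefaultMeshSidebandRegistry: Sideband channel died' \
  '2022-09-16 07:15:14,665 INFO  c.a.b.m.m.DefaultSidebandManager: Sideband connected' \
  '2022-09-14 07:11:58,103 WARN  c.a.s.i.s.g.m.DefaultMeshSidebandRegistry: Sideband channel died' \
  > /tmp/sample-bitbucket.log

# Count how many times the warning occurs.
grep -c 'Sideband channel died' /tmp/sample-bitbucket.log
```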

Cause

Each load balancer has its own idle timeout value. For example, an Amazon ALB configured with an idle_timeout of 300 seconds will disconnect the mesh node once that threshold is reached.

Having a load balancer in front of each mesh node is not something Bitbucket mandates; Bitbucket does not need a load balancer to handle request routing, because the mesh control plane within Bitbucket is capable of doing this itself. However, we have seen customers use dedicated load balancers for reasons such as dynamic IPs or SSL certificates managed by AWS ACM, and the setup works flawlessly in such cases.

Solution

The first and most obvious solution is to disable the idle timeout on the load balancer side, or to increase it substantially (the maximum idle_timeout supported by an ALB is 4000 seconds).
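For an AWS ALB specifically, the idle timeout is the `idle_timeout.timeout_seconds` load balancer attribute. A sketch of raising it with the AWS CLI; the load balancer ARN is a placeholder you would substitute with your own:

```shell
# Raise the ALB idle timeout to 4000 seconds (the ALB maximum).
# <your-lb-arn> is a placeholder -- substitute the ARN of the load
# balancer fronting your mesh node.
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn <your-lb-arn> \
  --attributes Key=idle_timeout.timeout_seconds,Value=4000
```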

On the Bitbucket side, there is an internal tunable property for the sideband heartbeat, which defaults to an interval of 5 minutes. You can try setting a shorter interval in the bitbucket.properties file:

plugin.bitbucket-git.mesh.grpc.client.healthcheck-interval=30s

This requires a restart of the Bitbucket instance to take effect.
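Putting that together on a typical Linux installation, a sketch; the BITBUCKET_HOME path and the service name "bitbucket" are assumptions to adjust for your environment:

```shell
# Shorten the sideband heartbeat interval from the 5-minute default to 30s.
# BITBUCKET_HOME and the service name are assumptions -- adjust as needed.
BITBUCKET_HOME="${BITBUCKET_HOME:-/var/atlassian/application-data/bitbucket}"
echo 'plugin.bitbucket-git.mesh.grpc.client.healthcheck-interval=30s' \
  | sudo tee -a "$BITBUCKET_HOME/shared/bitbucket.properties"

# Restart Bitbucket for the change to take effect.
sudo systemctl restart bitbucket
```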

Last modified on Oct 12, 2022
