Bitbucket Data Center Performance
Testing results summary
Better user management - Bitbucket Data Center is able to service more users concurrently as each node is added to the cluster.
Better Git performance - Git hosting operations scale especially well as each node is added to the cluster.
Response times hold up under increased load - Bitbucket Data Center does not respond to an individual request any faster than a single Bitbucket Server instance, but it remains responsive as the load increases.
Overall performance gains - The largest contributors to the overall performance of Bitbucket Data Center are the computing power, IO bandwidth and network connectivity available.
A Bitbucket Data Center deployment with more nodes is capable of servicing more requests as the load increases.
Testing methodology and specifications
The following sections describe the testing environment and methodology we use for our performance tests.
How we tested
To ensure that our measurements are accurate and reproducible, all testing is conducted on bare metal. Our test lab is completely isolated from outside interference, and before each test is executed, the lab servers are sanitized and reinitialized using a purpose-built test framework.
To generate enough load to exercise a multi-node cluster, we use a number of load generator servers, each with 24 CPUs and 48 GB of RAM. These load generators are also equipped with SSDs to remove IO restrictions. During a test run, all system telemetry is collected and dispatched to a central server, where it is collated and attached to the run that produced it. The telemetry is analyzed using R and Python scripts. From this information we can see exactly what each part of the system was doing at any given point in time. Java Virtual Machine, Tomcat and Bitbucket statistics are collected using JMX.
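As a rough illustration of the kind of post-run analysis described above, the sketch below groups response-time samples by test run and reports the median and 95th percentile for each. The record layout (`run_id`, `response_ms`) and the function name `summarize_runs` are hypothetical; the actual analysis scripts and telemetry schema are not described in this document.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize_runs(samples):
    """Group response-time samples by run and summarize each run.

    `samples` is an iterable of dicts with the hypothetical keys
    'run_id' and 'response_ms'.
    """
    by_run = defaultdict(list)
    for sample in samples:
        by_run[sample["run_id"]].append(sample["response_ms"])

    summary = {}
    for run_id, values in by_run.items():
        if len(values) > 1:
            # n=20 yields 19 cut points; the last one is the 95th percentile.
            p95 = quantiles(values, n=20, method="inclusive")[-1]
        else:
            # A single sample is its own percentile.
            p95 = values[0]
        summary[run_id] = {
            "count": len(values),
            "median_ms": median(values),
            "p95_ms": p95,
        }
    return summary
```

Comparing the summaries of runs made with different node counts is one simple way to check whether response times stay flat as load grows.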
Our Bitbucket Data Center deployment is built from the following components:
- Bitbucket Data Center cluster nodes. A heterogeneous collection of servers that can be grown and reduced as needed.
- A Bitbucket-supported RDBMS. We use PostgreSQL.
- A mass storage system with NFS support. We use a Linux-based NFSv3 server.
- A load balancer, which distributes incoming requests across the processing nodes. We use HAProxy, but also test with Apache mod_proxy.
What we tested
We concentrated on testing functional areas that are used on a daily basis and have a high impact on the overall throughput of the system. These are:
- Git operations (clone, push, pull, fetch)
- Authentication
Test scripts are tuned to emulate a desired usage profile by adjusting the following criteria:
- The number of users created and roles assigned (administrator, reviewer, contributor).
- The number of repositories and projects created.
- The number of users assigned to each repository.
- The number of pull requests to create.
- The number of files to be edited, deleted or renamed.
- The number of reviewers.
- The number of contributors.
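The criteria above could be captured in a small configuration object that a test script reads before seeding the cluster. This is only a sketch: the class name `UsageProfile`, its field names, and the validation rules are assumptions made for illustration, not part of the test framework described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageProfile:
    """Hypothetical container for the tunable criteria listed above."""

    users: int
    projects: int
    repositories: int
    users_per_repository: int
    pull_requests: int
    files_changed: int  # files edited, deleted or renamed
    reviewers: int
    contributors: int

    def validate(self):
        """Raise ValueError if a value is negative or inconsistent."""
        for name, value in vars(self).items():
            if value < 0:
                raise ValueError(f"{name} must be non-negative, got {value}")
        if self.users_per_repository > self.users:
            raise ValueError("users_per_repository cannot exceed users")
```

Keeping the profile immutable makes it easy to attach the exact settings to the run that used them.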
Before a test is executed, the cluster is seeded with a number of Git repositories of various sizes.
Hardware
All performance tests are run in the same controlled, isolated lab at Atlassian using the hardware listed below.
Hardware | Description | How many? |
---|---|---|
Rackform iServ R304.v3 | CPU: 2 x Intel Xeon E5-2430L, 2.0GHz (6-Core, HT, 15MB Cache, 60W) 32nm | 20 |
Arista DCS-7050 | 4PORT SFP+ REAR-TO-FRONT AIR 2XAC | 1 |
HP ProCurve Switch | 1810-48G 48 Port 10/100/1000 ports Web Managed Switch | 1 |
Usage profile
The typical usage profile we use is made up of the operation ratios shown in the pie chart below. The load is evenly spaced over typical web interactions (login, landing page, project and repository viewing and source browsing), pull requests (code review and collaboration) and hosting operations. We think these proportions are representative of the core workflow of a software development team.
Usage profile is representative of the typical workflow of a software development team.
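One way a load generator might turn such a profile into a stream of operations is weighted random selection, sketched below. The category names and the equal weights are placeholders that only mirror the "evenly spaced" description; the actual ratios come from the chart and are not reproduced here. The function name `pick_operations` is likewise hypothetical.

```python
import random

# Placeholder weights (assumption): equal shares for the three categories.
OPERATION_WEIGHTS = {
    "web_interaction": 1,
    "pull_request": 1,
    "hosting_operation": 1,
}


def pick_operations(count, weights=OPERATION_WEIGHTS, seed=None):
    """Return `count` operation names drawn in proportion to `weights`.

    Passing a seed makes the sequence repeatable between test runs.
    """
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)
```

Seeding the generator keeps the operation sequence identical from run to run, which helps when comparing clusters of different sizes under the same load.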
Conclusion
Bitbucket Data Center is an exciting new platform, raising the bar for high-concurrency, high-availability Git hosting. We have seen major benefits from deploying Bitbucket Data Center in our own organization. Bitbucket remains responsive even at peak usage, and aggressive CI server load no longer affects the responsiveness of the Bitbucket user interface.