Disk space hotspots and cleanup best practices in Bamboo
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
This guide helps you find the areas that consume the most disk space in your Bamboo system.
It covers:
- Determining which files are taking up the space, and whether the growth is unusual or simply the result of normal usage.
- Best practices for cleanup, and what can be done to reduce the current size or slow future growth.
- Tips for narrowing down which plans use the most space, best practices for build expiry and branch expiry, and how to find plans that override the global expiry settings.
Diagnosis
Once you detect atypical disk usage on your Bamboo server, check with your development team whether anything unusual is happening that may be generating excessive builds, such as a specific task, a sprint, a spike or urgent work within your company. Such activity can trigger many builds, put extra load on your CI/CD infrastructure and lead to this unexpected growth.
Filesystem usage
First, list the directories and their sizes. This shows the space used on a per-plan basis.
Artifacts
Find large artifacts
This command lists the disk space used by the artifacts of each plan. See the next section for how to match an artifact with its respective build.
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# du -h ${BAMBOO_HOME}/shared/artifacts/ --max-depth 1 | sort -rh
60K /var/atlassian/application-data/bamboo/shared/artifacts
32K /var/atlassian/application-data/bamboo/shared/artifacts/plan-15106049
12K /var/atlassian/application-data/bamboo/shared/artifacts/globalStorage
4.0K /var/atlassian/application-data/bamboo/shared/artifacts/plan-6062084
0 /var/atlassian/application-data/bamboo/shared/artifacts/tmp
0 /var/atlassian/application-data/bamboo/shared/artifacts/plan-1638401
(...)
More information: How do I know what Bamboo plan is storing artifacts in which directory on disk?
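As a minimal sketch, the listing above can be wrapped in a small helper that prints only the largest plan directories. The function name `top_artifact_dirs` and its arguments are hypothetical and not part of Bamboo; the sketch assumes GNU `find`, `xargs`, `du` and `sort`, as the command above does.

```shell
# Hypothetical helper (not part of Bamboo): print the N largest first-level
# directories under an artifacts directory, sizes in KiB, largest first.
# Usage: top_artifact_dirs <artifacts-dir> [count]
top_artifact_dirs() {
  local dir="$1"
  local count="${2:-10}"
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  # -mindepth 1 -maxdepth 1 restricts find to the direct children
  # (plan-<id>, globalStorage, tmp); du -sk prints one total per directory.
  find "$dir" -mindepth 1 -maxdepth 1 -type d -print0 \
    | xargs -0 -r du -sk \
    | sort -rn \
    | head -n "$count"
}
```

For example, `top_artifact_dirs "${BAMBOO_HOME}/shared/artifacts" 5` would print the five largest directories, which can then be matched to plans with the queries in the next section.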
Run a SQL query to link each artifact with its respective builds
This query will list all artifacts that are still valid, along with their sizes, Project/Plan, Location and Build dates. This is really useful if you are looking to remove a specific Build that is occupying a lot of space. The query below works on PostgreSQL.
SELECT build.build_type,
build.full_key,
artifact.label,
artifact.build_number,
artifact.artifact_id,
artifact.chain_artifact AS SHARED,
artifact.artifact_size AS SIZE,
Concat('<bamboo-home>', '/shared/artifacts/', storage_tag) AS Location,
buildresultsummary.build_date
FROM build
JOIN artifact
ON build.full_key = artifact.plan_key
JOIN brs_artifact_link
ON artifact.artifact_id = brs_artifact_link.artifact_id
JOIN buildresultsummary
ON buildresultsummary.buildresultsummary_id =
brs_artifact_link.producerjobresult_id
WHERE globally_stored = false
AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
artifact.link_type LIKE '%ServerLocalArtifactHandler' )
ORDER BY plan_key,
build_number,
shared;
build_type | full_key | label | build_number | artifact_id | shared | size | location | build_date
--------------+----------+---------------+--------------+-------------+--------+------+----------------------------------------------+-------------------------
CHAIN_BRANCH | ABC-AG3 | uptime.txt | 1 | 66289669 | t | 8 | <bamboo-home>/shared/artifacts/plan-65929221 | 2023-03-01 21:04:35.372
CHAIN_BRANCH | ABC-AG3 | uptime.txt | 2 | 66617345 | t | 8 | <bamboo-home>/shared/artifacts/plan-65929221 | 2023-03-04 22:38:24.094
CHAIN | ABC-GH | gh.txt | 26 | 69730306 | t | 11 | <bamboo-home>/shared/artifacts/plan-47087617 | 2023-04-03 16:10:25.523
CHAIN | ABC-GH | gh2.txt | 26 | 69730308 | t | 11 | <bamboo-home>/shared/artifacts/plan-47087617 | 2023-04-03 16:10:25.523
CHAIN_BRANCH | ABC-GH0 | gh.txt | 5 | 65470466 | t | 11 | <bamboo-home>/shared/artifacts/plan-64978947 | 2023-02-28 17:01:06.412
CHAIN | ABC-MF1 | date.txt | 32 | 60358676 | t | 30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-01-23 07:49:01.725
CHAIN | ABC-MF1 | date.txt | 33 | 62095363 | t | 30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-02-09 10:34:31.229
CHAIN | ABC-MF1 | date.txt | 34 | 62095364 | t | 30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-02-09 10:42:27.856
CHAIN | ABC-MF1 | date.txt | 35 | 62095365 | t | 30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-02-09 10:43:32.956
- Storage_tag value will be NULL for build_type JOB, which represents the jobs configured under the plan or plan branches.
- The build_type for plans and plan branches will be CHAIN and CHAIN_BRANCH, respectively, and these will have the storage_tag which represents the key for that particular plan or plan branch.
- The artifacts will be stored in the plan-key folder itself.
- You can filter out the build_type JOB in the query by adding a condition like build.build_type != 'JOB' (PostgreSQL string literals use single quotes; double quotes denote identifiers).
Find cumulative values for each Default branch and any Plan-branches
SELECT build.build_type,
build.full_key,
artifact.chain_artifact AS SHARED,
Sum(artifact.artifact_size) AS SIZE,
Concat('<bamboo-home>', '/shared/artifacts/', storage_tag) AS Location
FROM build
JOIN artifact
ON build.full_key = artifact.plan_key
JOIN brs_artifact_link
ON artifact.artifact_id = brs_artifact_link.artifact_id
JOIN buildresultsummary
ON buildresultsummary.buildresultsummary_id =
brs_artifact_link.producerjobresult_id
WHERE globally_stored = false
AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
artifact.link_type LIKE '%ServerLocalArtifactHandler' )
GROUP BY artifact.chain_artifact,
build.build_type,
build.full_key,
build.storage_tag,
artifact.plan_key
ORDER BY full_key,
size DESC,
build_type,
full_key;
build_type | full_key | shared | size | location
--------------+----------+--------+----------+---------------------------------------------
CHAIN | BAM-TOM | t | 18424 | <bamboo-home>/shared/artifacts/plan-15106049
CHAIN | MSP-BA | t | 264 | <bamboo-home>/shared/artifacts/plan-6062084
CHAIN_BRANCH | MSP-BA4 | t | 5242946 | <bamboo-home>/shared/artifacts/plan-20054025
CHAIN_BRANCH | MSP-BA5 | t | 10485826 | <bamboo-home>/shared/artifacts/plan-20054026
CHAIN_BRANCH | MSP-BA6 | t | 31457346 | <bamboo-home>/shared/artifacts/plan-20054027
CHAIN_BRANCH | MSP-BA7 | t | 26214466 | <bamboo-home>/shared/artifacts/plan-20054028
(6 rows)
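If you need separate totals, the first query below reports only the Default plans (build.master_id IS NULL), and the second sums the artifacts of all Plan branches under their master plan.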
SELECT build.build_type,
build.full_key,
artifact.chain_artifact AS SHARED,
Sum(artifact.artifact_size) AS SIZE,
Concat('<bamboo-home>', '/shared/artifacts/', storage_tag) AS Location
FROM build
JOIN artifact
ON build.full_key = artifact.plan_key
JOIN brs_artifact_link
ON artifact.artifact_id = brs_artifact_link.artifact_id
JOIN buildresultsummary
ON buildresultsummary.buildresultsummary_id =
brs_artifact_link.producerjobresult_id
WHERE globally_stored = false
AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
artifact.link_type LIKE '%ServerLocalArtifactHandler' )
AND build.master_id IS NULL
GROUP BY artifact.chain_artifact,
build.build_type,
build.full_key,
build.storage_tag,
artifact.plan_key
ORDER BY full_key,
size DESC,
build_type,
full_key;
build_type | full_key | shared | size | location
------------+----------+--------+-------+---------------------------------------------
CHAIN | BAM-TOM | t | 18424 | <bamboo-home>/shared/artifacts/plan-15106049
CHAIN | MSP-BA | t | 264 | <bamboo-home>/shared/artifacts/plan-6062084
(2 rows)
SELECT BM.build_type,
BM.full_key,
artifact.chain_artifact AS SHARED,
Sum(artifact.artifact_size) AS SIZE
FROM build
JOIN build BM
ON build.master_id = BM.build_id
JOIN artifact
ON build.full_key = artifact.plan_key
JOIN brs_artifact_link
ON artifact.artifact_id = brs_artifact_link.artifact_id
JOIN buildresultsummary
ON buildresultsummary.buildresultsummary_id =
brs_artifact_link.producerjobresult_id
WHERE globally_stored = false
AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
artifact.link_type LIKE '%ServerLocalArtifactHandler')
GROUP BY artifact.chain_artifact,
BM.build_type,
BM.full_key,
BM.storage_tag
ORDER BY full_key,
size DESC,
build_type,
full_key;
build_type | full_key | shared | size
------------+----------+--------+----------
CHAIN | MSP-BA | t | 73400584
(1 row)
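The size totals returned by the queries above appear to be expressed in bytes (an assumption based on the small sample files shown earlier). As a minimal sketch, a helper like the following converts such a value into a more readable unit. The function name `human_size` is hypothetical.

```shell
# Hypothetical helper: convert a byte count into a human-readable string
# using binary units. Assumes the SIZE column holds bytes.
# Usage: human_size <bytes>
human_size() {
  awk -v bytes="$1" 'BEGIN {
    split("B KiB MiB GiB TiB", unit, " ")
    i = 1
    # Divide by 1024 until the value fits the current unit.
    while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
    printf "%.1f %s\n", bytes, unit[i]
  }'
}
```

For example, `human_size 73400584` prints `70.0 MiB`, which is the combined plan branch total for MSP-BA in the sample output.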
Check for artifacts in Global Storage
Globally stored artifacts are located in a folder within <bamboo-home>/shared/artifacts/globalStorage. Artifacts in this location belong to build results that have expired (i.e. were cleaned up or removed) and are kept because a deployment plan still references them. This way, deployments of older versions will not fail if they are run.
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# du -h ${BAMBOO_HOME}/shared/artifacts/globalStorage --max-depth 1 | sort -rh
12K /var/atlassian/application-data/bamboo/shared/artifacts/globalStorage
4.0K /var/atlassian/application-data/bamboo/shared/artifacts/globalStorage/8617988
4.0K /var/atlassian/application-data/bamboo/shared/artifacts/globalStorage/6520840
4.0K /var/atlassian/application-data/bamboo/shared/artifacts/globalStorage/6520836
Check the following documentation for specific queries on how to identify and locate Global Storage artifacts:
Build results and logs
Depending on your workload, build jobs may generate a large volume of logs that consumes significant disk space. Even if a build plan uses little disk space for artifacts, its logs may still take up a lot.
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# find ${BAMBOO_HOME}/shared/builds/*/download-data/build_logs -maxdepth 1 ! -name "." -type d -print0 | xargs -0 -n1000 du -h | sort -rh
704K /var/atlassian/application-data/bamboo/shared/builds/plan-11927555-JOB1/download-data/build_logs
68K /var/atlassian/application-data/bamboo/shared/builds/15335425-14712834/download-data/build_logs
32K /var/atlassian/application-data/bamboo/shared/builds/plan-6062084-JOB1/download-data/build_logs
20K /var/atlassian/application-data/bamboo/shared/builds/plan-15106049-JOB1/download-data/build_logs
16K /var/atlassian/application-data/bamboo/shared/builds/plan-688129-JOB1/download-data/build_logs
16K /var/atlassian/application-data/bamboo/shared/builds/plan-1638401-JOB1/download-data/build_logs
16K /var/atlassian/application-data/bamboo/shared/builds/8388609-8552449/download-data/build_logs
8.0K /var/atlassian/application-data/bamboo/shared/builds/plan-11927557-JOB1/download-data/build_logs
8.0K /var/atlassian/application-data/bamboo/shared/builds/plan-10649603-JOB1/download-data/build_logs
8.0K /var/atlassian/application-data/bamboo/shared/builds/6225921-6389761/download-data/build_logs
4.0K /var/atlassian/application-data/bamboo/shared/builds/plan-11927555/download-data/build_logs
4.0K /var/atlassian/application-data/bamboo/shared/builds/plan-10649603/download-data/build_logs
4.0K /var/atlassian/application-data/bamboo/shared/builds/plan-10649601-RUN/download-data/build_logs
0 /var/atlassian/application-data/bamboo/shared/builds/plan-6062084/download-data/build_logs
0 /var/atlassian/application-data/bamboo/shared/builds/plan-1638401-JSJ/download-data/build_logs
0 /var/atlassian/application-data/bamboo/shared/builds/plan-1638401-GPFV/download-data/build_logs
0 /var/atlassian/application-data/bamboo/shared/builds/plan-1638401/download-data/build_logs
0 /var/atlassian/application-data/bamboo/shared/builds/plan-1638401-CHEC/download-data/build_logs
0 /var/atlassian/application-data/bamboo/shared/builds/plan-1638401-BUIL/download-data/build_logs
Once you understand which Plans are good candidates for a cleanup, you can adjust individual Plan expiry for that specific plan or even modify your Global expiry settings to be more aggressive.
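To see which individual log files dominate within those directories, a sketch like the following may help. The function name `largest_build_logs` is hypothetical; the directory layout is taken from the output above, and GNU `find`, `xargs`, `du` and `sort` are assumed.

```shell
# Hypothetical helper: list the N largest files stored under any
# <bamboo-home>/shared/builds/*/download-data/build_logs directory,
# sizes in KiB, largest first. Read-only: nothing is deleted.
# Usage: largest_build_logs <bamboo-home> [count]
largest_build_logs() {
  local home="$1"
  local count="${2:-20}"
  [ -d "$home/shared/builds" ] || { echo "no builds directory under: $home" >&2; return 1; }
  # -path matches against the whole path, so only files below a
  # build_logs directory are selected.
  find "$home/shared/builds" -type f -path '*/download-data/build_logs/*' -print0 \
    | xargs -0 -r du -k \
    | sort -rn \
    | head -n "$count"
}
```

Running `largest_build_logs "${BAMBOO_HOME}" 10` only reports sizes; removing the logs themselves is best left to the expiry settings described below.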
More information about important Bamboo directories:
Count build results
The build results volume that is stored in the Bamboo database has a very important impact on system performance.
Run this SQL statement to find the number of Build results per Plan. You can then use these numbers to plan a more aggressive individual Plan expiry.
SELECT CB.full_key,
Count(DISTINCT CSR.chainresult_id) AS chainresults
FROM build B
JOIN buildresultsummary BRS
ON B.full_key = BRS.build_key
JOIN chain_stage_result CSR
ON BRS.chain_result = CSR.chainresult_id
JOIN chain_stage CS
ON B.stage_id = CS.stage_id
JOIN build CB
ON CS.build_id = CB.build_id
GROUP BY CB.full_key
ORDER BY Count(DISTINCT CSR.chainresult_id) DESC;
full_key | chainresults
-------------+--------------
MY-PROJ1 | 201
BAM-TOM | 113
LARD-MOM | 102
MSP-BA | 83
MFP-MVFP | 62
BAM-FOO | 52
BAM-BOO | 44
DRA-MAIN | 20
PRJ-PLANKEY | 12
(9 rows)
Look for Plan Branches
Plan branches are linked to a repository and will be generated once a new Branch is created.
It is important to understand if those Plan branches can be cleaned up after use.
SELECT build_id,
build_type,
created_date,
updated_date,
full_key
FROM build
WHERE marked_for_deletion IS NOT NULL
AND build_type = 'CHAIN_BRANCH'
ORDER BY updated_date ASC;
Plan branches without any active expiry
Plan branch expiry must be set explicitly. If it is not configured, the Plan branch stays in Bamboo even after the branch is deleted from the remote repository, and you will have to delete it manually.
The following SQL statement helps you to locate the Plan branches without any expiry settings.
SELECT build.full_key,
build_definition.build_id,
build.created_date,
build.updated_date
FROM build
JOIN build_definition
ON build.build_id = build_definition.build_id
WHERE build.marked_for_deletion IS NOT NULL
AND build_definition.xml_definition_data LIKE
'%<branchRemovalCleanUpEnabled>false</branchRemovalCleanUpEnabled>%'
AND build_type = 'CHAIN'
ORDER BY updated_date ASC;
Solution
Once you have identified which Plans are the top consumers, you can either start deleting them one by one right away or act preventively by configuring Global and Plan-based expiry settings.
Global expiry
Before using Global expiry, it is important to understand a few technical aspects of it.
When the data is erased
Once you configure the global expiry criteria (and their exceptions) and activate them, the cleanup process will only start:
- Manually: Click "Run now" in the "Expiry" menu in Settings
- Scheduled: When the "next scheduled run" under the Removal schedule is reached
If you are concerned that the cleanup might start immediately, you can set a date far in the future under the Removal schedule just in case.
What data is erased
You have the following choices in terms of data that will be cleaned up:
- Complete build & deployment results, build & release artifacts and all logs (Excludes historical deployment records)
- Build and release artifacts only
- Build and deployment result logs only
Bamboo expiry will clean up the matched build logs and resulting files located within your <bamboo-home>/shared/builds folder.
Please note that expiry will not clean the working directories for Agents and plans within <bamboo-agent-home>/xml-data/build-dir. Those must be managed by the job itself, either with a Clean working directory task or with Plan Configuration >> Job >> Other >> Clean working directory after each build.
Check Locating important directories and files to understand how Bamboo stores each type of data.
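Because expiry does not touch the agent working directories, it can help to check them separately. The following is a minimal sketch that only reports candidates and deletes nothing. The function name `stale_build_dirs` and the 30-day default are assumptions, and GNU `find` is assumed.

```shell
# Hypothetical helper: list first-level working directories under
# <bamboo-agent-home>/xml-data/build-dir whose modification time is older
# than N days. Note that a directory's mtime only changes when entries are
# added to or removed from it directly, so treat the result as a rough hint.
# Usage: stale_build_dirs <bamboo-agent-home> [days]
stale_build_dirs() {
  local agent_home="$1"
  local days="${2:-30}"
  local build_dir="$agent_home/xml-data/build-dir"
  [ -d "$build_dir" ] || { echo "no build-dir under: $agent_home" >&2; return 1; }
  # -mtime +N matches entries last modified more than N full days ago.
  find "$build_dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}
```

Directories reported here are candidates for review; the Clean working directory options mentioned above remain the supported way to keep them small.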
Cleaning criteria
You can configure the global expiry retention criteria. Bamboo will keep the results as long as they meet the configured criteria.
Individual plan expiry, branch expiry
You can also have per-plan expiry rules that will override the global expiry settings that affect all plans in Bamboo. If you disable build expiry for a plan, that plan's build result data will never be automatically deleted from your Bamboo server. You can select the build result data that will be kept for a plan and for how long this data will be kept (e.g. for reporting purposes) before Bamboo automatically deletes it.
It is also important to understand that your top "Plan" results need to be combined with your "Plan branches" results. If you have lots of branches coming from your linked repositories, you may have to check for their sizes and consider them as a whole. Every artifact generated in a Default/master branch will also be contained in its Plan-branches.
More information here:
You can also delete individual build results for a plan as a one-off, manually. You can use this method if you want to have control over past build results that you wish to delete.
Plan branches expiry
You can also enable Plan branch expiry, which deletes plan branches from Bamboo once the corresponding branches are removed from your repository. This keeps the number of branches in Bamboo lean. Follow the path below for each Plan where you want to enable it:
- Plan Configuration -> Branches -> Delete plan branch (choose criteria)
Configuring a Plan branch cleanup:
Override global plan expiry report
To get a list of Plans that are overriding your Global expiry settings you can simply go to:
- Bamboo Administration -> Plans -> Expiry -> See plans with custom expiry settings (under Expiry overrides)
Alternatively, if using the Bamboo UI is too slow or you have too many results to analyse, you can use a REST API call or SQL SELECT statement for that:
$ curl -vvv --user $BAMBOO_ADMIN:$PASSWORD -k https://bamboo.mydomain.net/rest/api/latest/admin/expiry/custom/plan | jq
{
"self": "https://bamboo.mydomain.net/rest/api/latest/admin/expiry/custom/plan?start=0&limit=25",
"start": 0,
"limit": 25,
"results": [
{
"planName": "BAM - BOO",
"planKey": "BAM-BOO",
"configLink": {
"href": "https://bamboo.mydomain.net/chain/admin/config/editChainMiscellaneous.action?buildKey=BAM-BOO",
"rel": "edit"
},
"expiryConfig": {
"expiryTypeNothing": false,
"expiryTypeResult": true,
"expiryTypeArtifact": true,
"expiryBuildLog": true,
"duration": 0,
"period": "days",
"labelsList": "dontexpire",
"buildsToKeep": 0,
"maximumBuildsToKeep": 3
}
},
SELECT b.full_key ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/enabled/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE 'no'
end AS is_overwriting_expiry ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeNothing/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE 'no'
end AS do_not_expire_anything ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE 'no'
end AS is_expiring_result ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeBuildLog/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE 'no'
end
end AS is_expiring_build_log ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeArtifact/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
ELSE 'no'
end
end AS is_expiring_artifacts ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/duration/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/duration/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
end AS expire_after_days ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/buildsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/buildsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
end AS minimum_builds_to_keep ,
CASE
WHEN cast((xpath('//custom/buildExpiryConfig/labelsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/labelsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
end AS labels_to_keep
FROM build_definition bd
JOIN build b
ON (
bd.build_id = b.build_id)
WHERE b.build_type = 'CHAIN'
AND ( cast((xpath('//custom/buildExpiryConfig/enabled/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true'
OR cast((xpath('//custom/buildExpiryConfig/expiryTypeNothing/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'false' );
Please note, this may not work due to invalid XML that can be stored as a result of:
If it does not work, you can use the following query instead, although it does not lay out the data as cleanly:
SELECT B.full_key,
BD.*
FROM build_definition BD
JOIN build B
ON BD.build_id = B.build_id
WHERE B.build_type = 'CHAIN'
AND BD.xml_definition_data LIKE
'%<buildExpiryConfig>%<enabled>true</enabled>%</buildExpiryConfig>%';
Cleaning up the temporary location:
Please make sure you do not delete the <Bamboo-install>/temp folder itself; delete only the files stored in that location. Deleting the entire folder will affect the Bamboo service and cause downtime.
The temporary file location is used by the JVM as a temporary storage area. Its location is declared on the JVM by the java.io.tmpdir property in the catalina.sh and catalina.bat scripts. Tomcat is configured to use this temporary location rather than its default for security reasons. The temp directory must exist for Tomcat to work correctly. To avoid unexpected events, it is recommended to stop Bamboo before removing the files from that location.
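The difference between deleting the folder and deleting only its contents can be sketched as follows. The function name `clean_temp_contents` is hypothetical, GNU `find` is assumed, and the sketch should only be run while Bamboo is stopped.

```shell
# Hypothetical helper: remove everything inside a temp directory while
# keeping the directory itself in place.
# Usage: clean_temp_contents <temp-dir>
clean_temp_contents() {
  local temp_dir="$1"
  # Refuse empty arguments and non-directories so that a typo or an unset
  # variable cannot widen the deletion.
  if [ -z "$temp_dir" ] || [ ! -d "$temp_dir" ]; then
    echo "refusing: '$temp_dir' is not a directory" >&2
    return 1
  fi
  # -mindepth 1 excludes the starting directory from the match, so only
  # its contents are removed; -delete processes children before parents.
  find "$temp_dir" -mindepth 1 -delete
}
```

A call would look like `clean_temp_contents "<Bamboo-install>/temp"`, with the placeholder replaced by your actual installation path. Afterwards the temp directory still exists, which Tomcat requires, but it is empty.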