Disk space hotspots and cleanup best practices in Bamboo

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

 

Summary

Use this guide to find the areas that consume the most disk space in your Bamboo system.

This guide will help you with:

  • Determining which files take up the space, and whether the growth is unusual or simply the result of normal usage.
  • Best practices for cleanups, and what can be done to reduce disk usage or slow its growth going forward.
  • Tips for narrowing down which plans use the most space, as well as best practices on build expiry, branch expiry, and finding plans that override the global expiry.

Diagnosis

Once you detect atypical disk space usage on your Bamboo server, check with your development team whether anything unusual in application development may be generating excessive builds. This may be related to a specific task, a sprint, a spike, or something urgent within your company. Such activity can trigger many builds that put extra load on your CI/CD infrastructure and lead to this unexpected growth.

Filesystem usage

First, list the directories and their sizes. This shows the occupied space on a per-plan basis.

Artifacts

Find large artifacts

This command lists the disk space used by the artifacts of each plan. Please check the next section on how to match an artifact with its respective build.

Find large artifacts
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# du -h ${BAMBOO_HOME}/shared/artifacts/ --max-depth 1 | sort -rh
60K	/var/atlassian/application-data/bamboo/shared/artifacts
32K	/var/atlassian/application-data/bamboo/shared/artifacts/plan-15106049
12K	/var/atlassian/application-data/bamboo/shared/artifacts/globalStorage
4.0K	/var/atlassian/application-data/bamboo/shared/artifacts/plan-6062084
0	/var/atlassian/application-data/bamboo/shared/artifacts/tmp
0	/var/atlassian/application-data/bamboo/shared/artifacts/plan-1638401
(...)

More information:  How do I know what Bamboo plan is storing artifacts in which directory on disk?
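
On a large instance the listing above can run to thousands of lines. As a minimal sketch, the hypothetical helper below (largest_plan_dirs is not a Bamboo tool) keeps only the plan-* directories, so globalStorage and tmp are skipped, and prints the N largest with their size in KiB. It assumes a find that supports -mindepth and -maxdepth, as GNU and BSD find do.

```shell
# Hypothetical helper: print the N largest plan-* directories under an
# artifacts root as "<size in KiB><TAB><directory name>", largest first.
largest_plan_dirs() {
    artifacts_root=$1
    top_n=${2:-10}
    [ -d "$artifacts_root" ] || { echo "not a directory: $artifacts_root" >&2; return 1; }
    # du -sk prints one total per directory, in KiB
    find "$artifacts_root" -mindepth 1 -maxdepth 1 -type d -name 'plan-*' \
        -exec du -sk {} + |
        sort -rn |
        head -n "$top_n" |
        while IFS="$(printf '\t')" read -r size_kib path; do
            printf '%s\t%s\n' "$size_kib" "$(basename "$path")"
        done
}

# Example call (path taken from the listing above):
#   largest_plan_dirs /var/atlassian/application-data/bamboo/shared/artifacts 5
```

Each directory name printed this way is the last path component of the Location column in the SQL output below, which makes it easier to match a large directory to its plan.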


Run a SQL query to link each artifact with its respective builds

This query will list all artifacts that are still valid, along with their sizes, Project/Plan, Location and Build dates. This is really useful if you are looking to remove a specific Build that is occupying a lot of space. The query below works on PostgreSQL.

Link each artifact with their respective builds
SELECT build.build_type,
       build.full_key,
       artifact.label,
       artifact.build_number,
       artifact.artifact_id,
       artifact.chain_artifact                             AS SHARED,
       artifact.artifact_size                              AS SIZE,
       Concat('<bamboo-home>', '/shared/artifacts/', storage_tag) AS Location,
       buildresultsummary.build_date
FROM   build
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
             artifact.link_type LIKE '%ServerLocalArtifactHandler' )     
ORDER  BY plan_key,
          build_number,
          shared;
    
 build_type   | full_key |     label     | build_number | artifact_id | shared | size |                   location                   |       build_date        
--------------+----------+---------------+--------------+-------------+--------+------+----------------------------------------------+-------------------------
 CHAIN_BRANCH | ABC-AG3  | uptime.txt    |            1 |    66289669 | t      |    8 | <bamboo-home>/shared/artifacts/plan-65929221 | 2023-03-01 21:04:35.372
 CHAIN_BRANCH | ABC-AG3  | uptime.txt    |            2 |    66617345 | t      |    8 | <bamboo-home>/shared/artifacts/plan-65929221 | 2023-03-04 22:38:24.094
 CHAIN        | ABC-GH   | gh.txt        |           26 |    69730306 | t      |   11 | <bamboo-home>/shared/artifacts/plan-47087617 | 2023-04-03 16:10:25.523
 CHAIN        | ABC-GH   | gh2.txt       |           26 |    69730308 | t      |   11 | <bamboo-home>/shared/artifacts/plan-47087617 | 2023-04-03 16:10:25.523
 CHAIN_BRANCH | ABC-GH0  | gh.txt        |            5 |    65470466 | t      |   11 | <bamboo-home>/shared/artifacts/plan-64978947 | 2023-02-28 17:01:06.412
 CHAIN        | ABC-MF1  | date.txt      |           32 |    60358676 | t      |   30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-01-23 07:49:01.725
 CHAIN        | ABC-MF1  | date.txt      |           33 |    62095363 | t      |   30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-02-09 10:34:31.229
 CHAIN        | ABC-MF1  | date.txt      |           34 |    62095364 | t      |   30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-02-09 10:42:27.856
 CHAIN        | ABC-MF1  | date.txt      |           35 |    62095365 | t      |   30 | <bamboo-home>/shared/artifacts/plan-57999361 | 2023-02-09 10:43:32.956
  • Storage_tag value will be NULL for build_type JOB, which represents the jobs configured under the plan or plan branches.
  • The build_type for plans and plan branches will be CHAIN and CHAIN_BRANCH, respectively, and these will have the storage_tag which represents the key for that particular plan or plan branch.
  • The artifacts will be stored in the plan-key folder itself.
  • You can filter out the build_type JOB in the query by adding a condition such as build.build_type <> 'JOB' (PostgreSQL string literals use single quotes).

Find cumulative values for each Default branch and any Plan-branches

Find cumulative values for each Default branch and any Plan-branches
SELECT build.build_type,
       build.full_key,
       artifact.chain_artifact                             AS SHARED,
       Sum(artifact.artifact_size)                         AS SIZE,
       Concat('<bamboo-home>', '/shared/artifacts/', storage_tag) AS Location
FROM   build
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
             artifact.link_type LIKE '%ServerLocalArtifactHandler' )
GROUP  BY artifact.chain_artifact,
          build.build_type,
          build.full_key,
          build.storage_tag,
          artifact.plan_key
ORDER  BY full_key,
          size DESC,
          build_type,
          full_key; 
  build_type  | full_key | shared |   size   |                  location                  
--------------+----------+--------+----------+---------------------------------------------
 CHAIN        | BAM-TOM  | t      |    18424 | <bamboo-home>/shared/artifacts/plan-15106049
 CHAIN        | MSP-BA   | t      |      264 | <bamboo-home>/shared/artifacts/plan-6062084
 CHAIN_BRANCH | MSP-BA4  | t      |  5242946 | <bamboo-home>/shared/artifacts/plan-20054025
 CHAIN_BRANCH | MSP-BA5  | t      | 10485826 | <bamboo-home>/shared/artifacts/plan-20054026
 CHAIN_BRANCH | MSP-BA6  | t      | 31457346 | <bamboo-home>/shared/artifacts/plan-20054027
 CHAIN_BRANCH | MSP-BA7  | t      | 26214466 | <bamboo-home>/shared/artifacts/plan-20054028
(6 rows)

If you need separate totals for Default plans or for Plan branches, use the following queries:

Totals for Default plan builds
SELECT build.build_type,
       build.full_key,
       artifact.chain_artifact                             AS SHARED,
       Sum(artifact.artifact_size)                         AS SIZE,
       Concat('<bamboo-home>', '/shared/artifacts/', storage_tag) AS Location
FROM   build
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
             artifact.link_type LIKE '%ServerLocalArtifactHandler' )
       AND build.master_id IS NULL
GROUP  BY artifact.chain_artifact,
          build.build_type,
          build.full_key,
          build.storage_tag,
          artifact.plan_key
ORDER  BY full_key,
          size DESC,
          build_type,
          full_key;
 build_type | full_key | shared | size  |                   location                  
------------+----------+--------+-------+---------------------------------------------
 CHAIN      | BAM-TOM  | t      | 18424 | <bamboo-home>/shared/artifacts/plan-15106049
 CHAIN      | MSP-BA   | t      |   264 | <bamboo-home>/shared/artifacts/plan-6062084
(2 rows)
Totals for Branch plan builds
SELECT BM.build_type,
       BM.full_key,
       artifact.chain_artifact     AS SHARED,
       Sum(artifact.artifact_size) AS SIZE
FROM   build
       JOIN build BM
         ON build.master_id = BM.build_id
       JOIN artifact
         ON build.full_key = artifact.plan_key
       JOIN brs_artifact_link
         ON artifact.artifact_id = brs_artifact_link.artifact_id
       JOIN buildresultsummary
         ON buildresultsummary.buildresultsummary_id =
            brs_artifact_link.producerjobresult_id
WHERE  globally_stored = false
       AND ( artifact.link_type LIKE '%BambooRemoteArtifactHandler' OR
             artifact.link_type LIKE '%ServerLocalArtifactHandler')
GROUP  BY artifact.chain_artifact,
          BM.build_type,
          BM.full_key,
          BM.storage_tag
ORDER  BY full_key,
          size DESC,
          build_type,
          full_key; 
 build_type | full_key | shared |   size   
------------+----------+--------+----------
 CHAIN      | MSP-BA   | t      | 73400584
(1 row)
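
The size column in the queries above is a raw number. Assuming artifact_size is stored in bytes (an assumption, this article does not state the unit), a small hypothetical helper such as the sketch below turns a total into a figure that is easier to compare with the du output.

```shell
# Hypothetical helper: convert a byte count into binary units.
# Assumption: the artifact_size column holds bytes.
human_size() {
    awk -v bytes="$1" 'BEGIN {
        split("B KiB MiB GiB TiB", unit, " ")
        i = 1
        # Divide by 1024 until the value fits the current unit
        while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
        printf "%.1f %s\n", bytes, unit[i]
    }'
}

# Example: the MSP-BA branch total from the last query
#   human_size 73400584    # prints "70.0 MiB"
```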

Check for artifacts in Global Storage

Globally stored artifacts are located in a folder within <bamboo-home>/shared/artifacts/globalStorage. Artifacts in this location belong to build results that have expired (i.e. were cleaned up or removed) and are kept because a deployment plan still references them. This way, deployments of older versions will not fail if they are run.

Find largest artifacts in Global Storage
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# du -h ${BAMBOO_HOME}/shared/artifacts/globalStorage --max-depth 1 | sort -rh
12K	/var/atlassian/application-data/bamboo/shared/artifacts/globalStorage
4.0K	/var/atlassian/application-data/bamboo/shared/artifacts/globalStorage/8617988
4.0K	/var/atlassian/application-data/bamboo/shared/artifacts/globalStorage/6520840
4.0K	/var/atlassian/application-data/bamboo/shared/artifacts/globalStorage/6520836

Check the following documentation for specific queries on how to identify and locate Global Storage artifacts:

Build results and logs

Depending on your workload, build jobs may generate large volumes of logs that eat into your disk space. This means that even if a build plan uses little disk space for artifacts, it may still use a significant amount for its logs.

Find largest build logs
# BAMBOO_HOME=/var/atlassian/application-data/bamboo
# find ${BAMBOO_HOME}/shared/builds/*/download-data/build_logs -maxdepth 1 ! -name "." -type d -print0 | xargs -0 -n1000 du -h | sort -rh
704K	/var/atlassian/application-data/bamboo/shared/builds/plan-11927555-JOB1/download-data/build_logs
68K	/var/atlassian/application-data/bamboo/shared/builds/15335425-14712834/download-data/build_logs
32K	/var/atlassian/application-data/bamboo/shared/builds/plan-6062084-JOB1/download-data/build_logs
20K	/var/atlassian/application-data/bamboo/shared/builds/plan-15106049-JOB1/download-data/build_logs
16K	/var/atlassian/application-data/bamboo/shared/builds/plan-688129-JOB1/download-data/build_logs
16K	/var/atlassian/application-data/bamboo/shared/builds/plan-1638401-JOB1/download-data/build_logs
16K	/var/atlassian/application-data/bamboo/shared/builds/8388609-8552449/download-data/build_logs
8.0K	/var/atlassian/application-data/bamboo/shared/builds/plan-11927557-JOB1/download-data/build_logs
8.0K	/var/atlassian/application-data/bamboo/shared/builds/plan-10649603-JOB1/download-data/build_logs
8.0K	/var/atlassian/application-data/bamboo/shared/builds/6225921-6389761/download-data/build_logs
4.0K	/var/atlassian/application-data/bamboo/shared/builds/plan-11927555/download-data/build_logs
4.0K	/var/atlassian/application-data/bamboo/shared/builds/plan-10649603/download-data/build_logs
4.0K	/var/atlassian/application-data/bamboo/shared/builds/plan-10649601-RUN/download-data/build_logs
0	/var/atlassian/application-data/bamboo/shared/builds/plan-6062084/download-data/build_logs
0	/var/atlassian/application-data/bamboo/shared/builds/plan-1638401-JSJ/download-data/build_logs
0	/var/atlassian/application-data/bamboo/shared/builds/plan-1638401-GPFV/download-data/build_logs
0	/var/atlassian/application-data/bamboo/shared/builds/plan-1638401/download-data/build_logs
0	/var/atlassian/application-data/bamboo/shared/builds/plan-1638401-CHEC/download-data/build_logs
0	/var/atlassian/application-data/bamboo/shared/builds/plan-1638401-BUIL/download-data/build_logs 

Once you know which Plans are good candidates for a cleanup, you can adjust the expiry settings of those individual Plans or make your Global expiry settings more aggressive.
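
The per-directory totals above do not show whether a handful of huge log files or many small ones are responsible. As a rough sketch, the hypothetical helper below (largest_build_logs is an illustrative name) lists the largest individual files found under any download-data/build_logs directory. It only reads the filesystem and deletes nothing.

```shell
# Hypothetical helper: list the N largest files below any
# download-data/build_logs directory as "<size in KiB><TAB><path>".
largest_build_logs() {
    builds_root=$1
    top_n=${2:-20}
    [ -d "$builds_root" ] || { echo "not a directory: $builds_root" >&2; return 1; }
    # -path restricts the match to files stored inside build_logs directories
    find "$builds_root" -type f -path '*/download-data/build_logs/*' \
        -exec du -k {} + |
        sort -rn |
        head -n "$top_n"
}

# Example call (path taken from the listing above):
#   largest_build_logs /var/atlassian/application-data/bamboo/shared/builds 10
```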

More information about important Bamboo directories: Locating important directories and files

Count build results

The number of build results stored in the Bamboo database has a significant impact on system performance.

Run this SQL statement to find the number of Build results per Plan. You can then use these numbers to plan a more aggressive individual Plan expiry.

Count build results
SELECT CB.full_key,
       Count(DISTINCT CSR.chainresult_id) AS chainresults
FROM   build B
       JOIN buildresultsummary BRS
         ON B.full_key = BRS.build_key
       JOIN chain_stage_result CSR
         ON BRS.chain_result = CSR.chainresult_id
       JOIN chain_stage CS
         ON B.stage_id = CS.stage_id
       JOIN build CB
         ON CS.build_id = CB.build_id
GROUP  BY CB.full_key
ORDER  BY Count(DISTINCT CSR.chainresult_id) DESC;
  full_key   | chainresults 
-------------+--------------
 MY-PROJ1    |           201
 BAM-TOM     |           113
 LARD-MOM    |           102
 MSP-BA      |            83
 MFP-MVFP    |            62
 BAM-FOO     |            52
 BAM-BOO     |            44
 DRA-MAIN    |            20
 PRJ-PLANKEY |            12
(9 rows)

Look for Plan Branches

Plan branches are linked to a repository and will be generated once a new Branch is created.

It is important to understand if those Plan branches can be cleaned up after use.

Show plan branches
SELECT build_id,
       build_type,
       created_date,
       updated_date,
       full_key
FROM   build
WHERE  marked_for_deletion IS NOT NULL
       AND build_type = 'CHAIN_BRANCH'
ORDER  BY updated_date ASC;

Plan branches without any active expiry

Plan branch expiry must be configured explicitly. If it is not, the Plan branch stays in Bamboo even after the Branch is deleted from the remote repository, and you will have to delete it manually.

The following SQL statement helps you locate the plans whose Plan branches have no cleanup on branch removal configured.

Show plan branches without expiry settings
SELECT build.full_key,
       build_definition.build_id,
       build.created_date,
       build.updated_date
FROM   build
       JOIN build_definition
         ON build.build_id = build_definition.build_id
WHERE  build.marked_for_deletion IS NOT NULL
       AND build_definition.xml_definition_data LIKE
           '%<branchRemovalCleanUpEnabled>false</branchRemovalCleanUpEnabled>%'
       AND build_type = 'CHAIN'
ORDER  BY updated_date ASC; 


Solution

Once you have identified which Plans are the top consumers, you can either start deleting them one by one or act preventatively by configuring Global and Plan-based expiry settings.

Global expiry

Before using Global expiry, it is important to understand a few technical aspects of it.

When the data is erased

Once you configure the global expiry criteria (and their exceptions) and activate them, the cleanup process starts only in one of two ways:

  • Manually: when you click "Run now" under the "Expiry" menu in Settings
  • Scheduled: when the "next scheduled run" under the Removal schedule is reached

If you are concerned that the cleanup might start immediately, you can set a date far in the future under the Removal schedule just in case.

What data is erased

You have the following choices in terms of data that will be cleaned up:

  • Complete build & deployment results, build & release artifacts and all logs (Excludes historical deployment records)
  • Build and release artifacts only
  • Build and deployment result logs only

Bamboo expiry will clean up the matched build logs and resulting files located within your <bamboo-home>/shared/builds folder.

Please note that expiry will not clean the working directories of Agents and plans within <bamboo-agent-home>/xml-data/build-dir. These must be managed by the job itself, either with a Clean working directory task or through Plan Configuration >> Job >> Other >> Clean working directory after each build.
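
To see which working directories are worth looking at, a read-only sketch like the one below can help. stale_working_dirs is a hypothetical helper, and the 30-day default is an arbitrary assumption. It prints the directories directly under build-dir that were last modified more than the given number of days ago and removes nothing.

```shell
# Hypothetical helper: list working directories directly under
# <bamboo-agent-home>/xml-data/build-dir whose modification time is more
# than DAYS days old. Read-only: it prints paths and deletes nothing.
stale_working_dirs() {
    build_dir=$1
    days=${2:-30}
    [ -d "$build_dir" ] || { echo "not a directory: $build_dir" >&2; return 1; }
    # -mindepth/-maxdepth limit the search to direct children (GNU and BSD find)
    find "$build_dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" | sort
}

# Example call; replace <bamboo-agent-home> with your agent home:
#   stale_working_dirs <bamboo-agent-home>/xml-data/build-dir 60
```

A directory's modification time only changes when entries are added to or removed from it directly, so treat the output as a list of candidates to review rather than a list of directories that are safe to delete.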

Check Locating important directories and files to understand how Bamboo stores each type of data.

Cleaning criteria

You can configure the global expiry retention criteria. Bamboo will keep the results as long as they meet the configured criteria.

Individual plan expiry, branch expiry

You can also have per-plan expiry rules that will override the global expiry settings that affect all plans in Bamboo. If you disable build expiry for a plan, that plan's build result data will never be automatically deleted from your Bamboo server. You can select the build result data that will be kept for a plan and for how long this data will be kept (e.g. for reporting purposes) before Bamboo automatically deletes it.

It is also important to understand that your top "Plan" results need to be combined with your "Plan branches" results. If you have lots of branches coming from your linked repositories, you may have to check for their sizes and consider them as a whole. Every artifact generated in a Default/master branch will also be contained in its Plan-branches.

More information here:

You can also delete individual build results for a plan as a one-off, manually. You can use this method if you want to have control over past build results that you wish to delete.

Plan branches expiry

You can also enable Plan-branches expiry, which will delete branches from Bamboo once they are removed from your repository. This will make Bamboo leaner in terms of the number of branches you have. Follow the path below for each Plan where you want to enable it:

  • Plan Configuration -> Branches -> Delete plan branch (choose criteria)

Configuring a Plan branch cleanup:

Override global plan expiry report

To get a list of Plans that are overriding your Global expiry settings you can simply go to:

  • Bamboo Administration -> Plans -> Expiry -> See plans with custom expiry settings (under Expiry overrides)

Alternatively, if using the Bamboo UI is too slow or you have too many results to analyse, you can use a REST API call or SQL SELECT statement for that:

REST API show custom plan expiry
$ curl -vvv --user $BAMBOO_ADMIN:$PASSWORD -k https://bamboo.mydomain.net/rest/api/latest/admin/expiry/custom/plan | jq
>
{
  "self": "https://bamboo.mydomain.net/rest/api/latest/admin/expiry/custom/plan?start=0&limit=25",
  "start": 0,
  "limit": 25,
  "results": [
    {
      "planName": "BAM - BOO",
      "planKey": "BAM-BOO",
      "configLink": {
        "href": "https://bamboo.mydomain.net/chain/admin/config/editChainMiscellaneous.action?buildKey=BAM-BOO",
        "rel": "edit"
      },
      "expiryConfig": {
        "expiryTypeNothing": false,
        "expiryTypeResult": true,
        "expiryTypeArtifact": true,
        "expiryBuildLog": true,
        "duration": 0,
        "period": "days",
        "labelsList": "dontexpire",
        "buildsToKeep": 0,
        "maximumBuildsToKeep": 3
      }
    },
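
The response above is paged: it reports "start": 0 and "limit": 25, and its self link carries the same two query parameters. As a sketch only, the loop below walks through the pages and prints each planKey. It assumes that start is an item offset and that a page with fewer results than limit is the last one; neither is confirmed by this article. fetch_page and list_custom_expiry_plans are hypothetical helper names, and the host and credential variables are the placeholders from the command above.

```shell
# fetch_page START LIMIT: print one page of JSON.
# The original command also passes -k; add it only if you have to skip
# TLS certificate checks.
fetch_page() {
    curl -s --user "$BAMBOO_ADMIN:$PASSWORD" \
        "https://bamboo.mydomain.net/rest/api/latest/admin/expiry/custom/plan?start=$1&limit=$2"
}

# Print the planKey of every plan with custom expiry, one page at a time.
list_custom_expiry_plans() {
    limit=${1:-25}
    start=0
    while :; do
        page=$(fetch_page "$start" "$limit") || return 1
        keys=$(printf '%s' "$page" | jq -r '.results[].planKey') || return 1
        [ -n "$keys" ] || break
        printf '%s\n' "$keys"
        # Assumption: a short page means there is nothing left to fetch
        count=$(printf '%s\n' "$keys" | grep -c .)
        [ "$count" -lt "$limit" ] && break
        start=$((start + limit))
    done
}

# Example call:
#   list_custom_expiry_plans 25
```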
SQL show custom plan expiry
SELECT b.full_key ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/enabled/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE 'no'
       end AS is_overwriting_expiry ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeNothing/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE 'no'
       end AS do_not_expire_anything ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE 'no'
       end AS is_expiring_result ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE
                     CASE
                            WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeBuildLog/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
                            ELSE 'no'
                     end
       end AS is_expiring_build_log ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeResult/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
              ELSE
                     CASE
                            WHEN cast((xpath('//custom/buildExpiryConfig/expiryTypeArtifact/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true' THEN 'yes'
                            ELSE 'no'
                     end
       end AS is_expiring_artifacts ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/duration/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/duration/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
       end AS expire_after_days ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/buildsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/buildsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
       end AS minimum_builds_to_keep ,
       CASE
              WHEN cast((xpath('//custom/buildExpiryConfig/labelsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) IS NOT NULL THEN cast((xpath('//custom/buildExpiryConfig/labelsToKeep/text()',cast(bd.xml_definition_data AS xml)))[1] AS text)
       end AS labels_to_keep
FROM   build_definition bd
JOIN   build b
ON     (
              bd.build_id = b.build_id)
WHERE  b.build_type = 'CHAIN'
-- Parentheses keep the build_type filter applied to both conditions
AND    (
              cast((xpath('//custom/buildExpiryConfig/enabled/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'true'
       OR     cast((xpath('//custom/buildExpiryConfig/expiryTypeNothing/text()',cast(bd.xml_definition_data AS xml)))[1] AS text) = 'false');

Please note, this may not work due to invalid XML that can be stored as a result of:

If it does not work, you can use the simplified query below, although it does not lay out the data as cleanly for presentation:

SQL show custom plan expiry - simplified
SELECT B.full_key,
       BD.*
FROM   build_definition BD
       JOIN build B
         ON BD.build_id = B.build_id
WHERE  B.build_type = 'CHAIN'
       AND BD.xml_definition_data LIKE
           '%<buildExpiryConfig>%<enabled>true</enabled>%</buildExpiryConfig>%'; 

Cleaning up the temporary location:

Please make sure you do not delete the <Bamboo-install>/temp folder itself; delete only the files stored in that location. Deleting the entire folder will affect the Bamboo service and cause downtime.

The temporary file location is used by the JVM as a temporary storage area. Its location is declared on the JVM by the java.io.tmpdir property in the catalina.sh and catalina.bat scripts. Tomcat is configured to use this temporary location rather than its default for security reasons. The temp directory must exist for Tomcat to work correctly. To avoid unexpected events, it is recommended to stop Bamboo before removing the files from that location.
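
As a minimal sketch of "delete the contents, keep the folder", the hypothetical helper below removes everything inside a directory but leaves the directory itself in place. It assumes a find that supports -mindepth and -delete (GNU and BSD find do), and it should only be run after Bamboo has been stopped.

```shell
# Hypothetical helper: delete everything inside a temp directory while
# keeping the directory itself. Stop Bamboo before running it.
clean_temp_contents() {
    temp_dir=$1
    # Refuse an empty argument or the filesystem root
    case "$temp_dir" in
        ""|"/") echo "refusing to clean: '$temp_dir'" >&2; return 1 ;;
    esac
    [ -d "$temp_dir" ] || { echo "not a directory: $temp_dir" >&2; return 1; }
    # -mindepth 1 skips the directory itself; -delete works depth-first,
    # so files are removed before the subdirectories that contain them
    find "$temp_dir" -mindepth 1 -delete
}

# Example call; replace <Bamboo-install> with your installation directory:
#   clean_temp_contents <Bamboo-install>/temp
```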


Last modified on Jun 21, 2023
