Bamboo: Right to erasure
Under Article 17 of the GDPR, individuals have the right to have personal data erased. This is also known as the ‘right to be forgotten’. The right is not absolute and only applies in certain circumstances. Whether or not you are required to honor an individual's request to have personal data deleted will vary on a case-by-case basis, and is a determination you should always make with the assistance of legal counsel. Once you have determined you have an obligation to delete personal data, we have provided the following instructions on how to do so within certain Atlassian products.
Personal data stored within the product can be divided into two areas: 1) account-level personal data; and 2) free-form text. Account-level personal data are data fields that exist within the product for the sole purpose of identifying an individual throughout the product. Examples of account-level personal data include the user's display name, profile picture or avatar, and email address. These data elements are generally visible from the user's profile and are used throughout the product to point back to the user's profile when the user is @mentioned or tagged in certain spaces or content. Deleting account-level personal data elements will automatically remove those data elements throughout the product where the relevant account-level data elements appear and in the database (subject to some limitations discussed below).
If you have included personal data in free-form text, either typed into content spaces or as a custom field label, you will need to use the product's global search feature to surface this personal data and delete it on a case-by-case basis.
Locating and Accessing Personal Data in Bamboo
Personal Data (PD) is stored in Bamboo in one of four ways:
- Structured PD: data in user profiles
- Unstructured PD: data associated with Bamboo builds, results, deployment projects, environments, versions - free text
- Filesystem PD on the server: other data stored on a server (build result logs, artifacts, audit logs, global entities, configuration etc.)
- Filesystem PD on the agent: other data stored on the agent (build result logs, caches, artifacts)
Structured PD
User profiles contain specific PD elements used to represent users in the Bamboo system.
This data is mainly used in:
- profile page (https://confluence.atlassian.com/bamboo/managing-your-user-profile-289277031.html)
- REST API (https://developer.atlassian.com/server/bamboo/rest-apis/)
- user picker
- user authentication (number of unsuccessful login attempts, password reset token, remember me)
- user permissions
- repository commit author
- result comments author
- notifications
- user responsible for result failure
- favorite builds (kept per user)
- deployment version approver
- author of change in the audit log
User profiles hold the following PD elements:
User profile data | Description |
---|---|
Full name | Text used to represent a user in the Bamboo interface. All links to the user profile use this text. In many cases, it holds PD such as first name and surname. |
User name / login | Text representing a person during login. It is used internally in the database to correlate additional data with a user profile. It can also be visible in some REST and page URLs. |
Email | Email address associated with a user account. Accessible on the user profile. |
IM | IM address associated with the user's IM account. Accessible on the user profile. |
Unstructured PD
PD could also be stored in free-form text data fields. Because these fields allow any content, topic or label, they may or may not contain PD, depending on the instance configuration.
Domain objects (plans, results, deployment projects, releases) and their associated entities can hold any type of information, as they can contain many free-text values.
Global entities (project descriptions, variables, repositories, shared credentials, other configuration etc.) can hold free text values.
Incidental PD
Various processes that run within or alongside Bamboo may store PD incidental to their functions. Below is a list of processes that may store PD incidentally.
Filesystem
Lucene index
To speed up searching, Bamboo uses the Lucene library (search index). This index duplicates some information from the DB and stores it on the filesystem. When SQL queries are executed against the DB, there's a risk that stale data will remain in the Lucene index (e.g. authors in the build results index, or project/plan names and descriptions in the quick search index). To refresh the Lucene index, a reindex needs to be performed. See https://confluence.atlassian.com/display/BAMBOO/Reindexing+data.
Lucene indexes are located in the ${bamboo_home}/index directory.
If reindexing is not possible, selected documents could be searched and deleted using this tool: https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/luke/lukeall-3.5.0.jar
Artifacts
The placement of artifacts depends on the artifact handler that was used for the plan result (or the global artifact handler if none was set for the specific plan).
The most popular artifact handler is the Bamboo Remote Handler. Its artifacts are stored on the Bamboo server, in the ${bamboo_home}/artifacts directory.
Another popular artifact handler is the Amazon S3 Handler. Its artifacts are stored on Amazon S3 servers, and the location is configured in the Bamboo administration panel.
To read more about artifact handlers and their configuration, see: https://confluence.atlassian.com/display/BAMBOOSERVERM/Artifact+handlers
Server Logs
Name | Location | PD details |
---|---|---|
Bamboo server logs | ${bamboo_home}/log/* , ${bamboo_install}/logs/catalina.out | Can contain arbitrary data (hard to tell because of possible extensive logging) |
Bamboo build logs | ${bamboo_home}/xml-data/builds/JOB_KEY/* | Information specific to all builds, can contain arbitrary data |
Analytics logs | ${bamboo_home}/analytics-logs/* | Generally should not contain PD |
Access logs | ${bamboo_install}/logs/access_log.* | Can contain username/ip address and URL of accessed resources. |
Tomcat logs | ${bamboo_install}/logs/* | Might contain some PD. |
Other server logs | ${bamboo_home}/log/*, ${bamboo_install}/logs/* | Might contain some PD. |
To read more about logging in Bamboo, see https://confluence.atlassian.com/bamboo/logging-in-bamboo-289277239.html
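The locations in the table above can be searched for a known identifier (for example a username or email address) before deciding what needs to be deleted. The following is a minimal Python sketch, not an Atlassian tool: the function name `find_pd_in_files` is hypothetical, and it assumes the files are readable as text by the account running the script.

```python
from pathlib import Path


def find_pd_in_files(root, terms):
    """Return (path, line_number, line) for each line under root that
    contains any of the given terms, compared case-insensitively."""
    needles = [term.lower() for term in terms]
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on binary or mis-encoded files.
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()
                if any(needle in lowered for needle in needles):
                    hits.append((str(path), number, line.rstrip("\n")))
    return hits
```

Call it with the resolved path of a directory from the table (for example, the directory that ${bamboo_home}/log points to on your server) and a list of identifiers. The sketch only reports matches. Whether and how to remove a matching line remains a manual decision.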
Memory
Bamboo caches
To speed up certain actions, Bamboo caches some data from the DB in memory, which makes repeated DB calls unnecessary. Users cannot access these caches directly. The system uses them only to serve data faster.
It's recommended to update the DB with manual SQL queries only while the Bamboo server is stopped. Otherwise, the cached data may differ from the data in the DB, which can lead to data inconsistency.
Agents
Remote agent
All remote agent activity is recorded in the atlassian-bamboo-agent.log file stored on the agent machine in the running directory of the agent. The running directory can be viewed in the remote agent's system properties in the Bamboo Paths section. These logs can contain arbitrary data, and in general, they do not contain PD used by Bamboo.
When the agent is performing builds, it stores data in ${bamboo_agent_home}/xml-data/build-dir/JOB_KEY/*. The default name of the Bamboo agent home directory is bamboo-agent-home and its location depends on your operating system. To read more about it, check the Bamboo agent home directory section here: https://confluence.atlassian.com/bamboo/locating-important-directories-and-files-289277247.html
Elastic agent
All elastic agent activity is logged inside the elastic instance where the elastic agent runs. By default, it's stored in two files: atlassian-bamboo.log and bamboo-elastic-agent.out. The actual files depend on the elastic image configuration and on the operating system of the elastic agent.
Build data on the elastic agent is stored in the same way as on a remote agent.
To read more about elastic agent logs, see here: https://confluence.atlassian.com/bamboo/viewing-an-elastic-instance-289277134.html.
External storage
Backups
It's up to you to define the purpose and retention policy for backed-up files. Bamboo only generates the backup for the end user to use. See more: https://confluence.atlassian.com/bamboo/exporting-data-for-backup-289277255.html.
Deleting and/or Modifying PD in Bamboo
Once you've identified where PD may be stored in your Bamboo instance, this section describes how to delete or modify that PD.
Workaround
Follow best practices for Change Management: test and validate these settings in Test/Development and Staging environments before rolling out any changes to a Production environment. You must test and validate these changes to ensure that they will function well within your infrastructure before placing them in production.
Deleting or modifying PD
Deleting and modifying user PD is virtually the same process. This is because we do not recommend deleting an entire user account from Bamboo. User accounts are an integral part of the Bamboo data structure and critical for maintaining the data consistency of the system.
Rather than deleting the data, we recommend modifying PD elements in the account to display elements that do not identify the user. For example, replacing the username johnsmith with deleteduser1. This way the system will be able to properly function while allowing you to remove profile-level PD that otherwise could identify the user. You can also use this process if you are simply looking to modify a user's PD - for example, if nicholassmith is actually nicksmith.
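When several accounts need to be anonymized, it can help to prepare the replacement usernames up front so that none of them collides with a username already in use. The following is a minimal Python sketch of that bookkeeping only. The function name `build_pseudonyms` and its parameters are hypothetical, and it does not touch Bamboo or its database.

```python
def build_pseudonyms(usernames, existing=(), prefix="deleteduser"):
    """Map each username to a non-identifying replacement such as
    'deleteduser1', skipping any replacement listed in `existing`."""
    taken = {name.lower() for name in existing}
    mapping = {}
    counter = 1
    for name in usernames:
        if name in mapping:
            continue  # the same username listed twice gets one replacement
        candidate = f"{prefix}{counter}"
        while candidate.lower() in taken:
            counter += 1
            candidate = f"{prefix}{counter}"
        mapping[name] = candidate
        taken.add(candidate.lower())
        counter += 1
    return mapping


print(build_pseudonyms(["johnsmith"]))
```

For the example in the text, this prints a mapping from johnsmith to deleteduser1. Keep the resulting mapping only as long as you need it to apply the changes, because the mapping itself links the replacement back to the individual.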
Modifying user PD
Modifying user PD has to be performed in several steps, depending on where the data is stored.
To modify user data:
- Handle PD in "structured" data fields
  - (UI) Modify data in user profile - this step depends on the type of Directory that Bamboo is using for managing users.
  - (SQL) Optionally, modify "username" - only if "username" contains PD (SQL update statements have to be executed against the stopped Bamboo instance)
- Handle PD in "free-form text" data fields
  - (SQL) Handle PD in other entries (SQL update statements have to be executed against the stopped Bamboo instance).
- After change actions (only if SQL update statements were executed)
  - Reindex Bamboo. See Reindexing data.
Handle PD in "structured" data fields
Modify PD in user profile - external user directory
Modify PD in user profile - internal User Directory
How to modify PD in user profile using internal directory
Modifying username (Optional - only when username contains PD)
Modifying the username could break third-party plugins that reference it.
Handle PD in "free-form text" data fields
After change actions (if SQL update statements were executed)
If SQL update statements were executed you will have to reindex Bamboo.
- Reindex Bamboo - a Lucene reindex is required because some data is stored in and read from the Lucene index, which could contain stale data after the DB is updated. See Reindexing data.
Version Compatibility
All workarounds are compatible with Bamboo 6.5 and later.
Limitations
- SQL statements use pattern matching, so they require manual inspection before each update.
- MySQL doesn't have the REGEXP_REPLACE function (or any other function that works in a similar manner), so we are able to find matching records ignoring case, but we are not able to generate SQL that updates values in a case-insensitive way. Manual inspection/update is needed.
- Microsoft SQL Server does not support regular expressions to the same extent as the other supported databases. Records are matched using the LIKE operator, which can also match longer substrings. Updates use the "replace" function, which, in conjunction with a case-insensitive collation, replaces every occurrence regardless of case with the exact replacement text, e.g. replace("and TEST second as test third", "test", "tESt") = "and tESt second as tESt third".
- Data could be stored by third-party plugins and not be discovered, altered, or deleted by querying the DB (plugin tables are not scanned for PD).
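The two matching pitfalls described above (case-insensitive replacement and substring matches on longer values) can be reproduced outside the database, which may help when manually inspecting candidate records. The following is an illustrative Python sketch using the standard `re` module. The helper names are hypothetical, and the code only mimics the described behaviour. It is not the SQL that the workaround runs.

```python
import re


def replace_ignoring_case(text, old, new):
    """Replace every occurrence of `old`, whatever its case, with `new`
    exactly as written (mirrors the replace() example above)."""
    # A function replacement avoids backslash interpretation in `new`.
    return re.sub(re.escape(old), lambda _match: new, text, flags=re.IGNORECASE)


def whole_word_matches(text, username):
    """Return occurrences of `username` that are not part of a longer word."""
    pattern = rf"(?<!\w){re.escape(username)}(?!\w)"
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


print(replace_ignoring_case("and TEST second as test third", "test", "tESt"))
# prints: and tESt second as tESt third

# A plain substring test also matches the longer name "nicksmithson",
# which is the over-matching risk noted for LIKE-style patterns.
print("nicksmith" in "nicksmithson")
# prints: True
print(whole_word_matches("nicksmith and nicksmithson", "nicksmith"))
# prints: ['nicksmith']
```

The second helper shows why each matched record needs manual inspection: a substring match alone cannot tell one user's name from a longer name that contains it.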
Additional notes
There may be limitations based on your product version.
Note, the above-related GDPR workaround has been optimized for the latest version of this product. If you are running on a legacy version of the product, the efficacy of the workaround may be limited. Please consider upgrading to the latest product version to optimize the workarounds available under this article.
Third-party add-ons may store personal data in their own database tables or on the filesystem.
The above article in support of your GDPR compliance efforts applies only to personal data stored within the Atlassian server and data center products. To the extent you have installed third-party add-ons within your server or data center environment, you will need to contact that third-party add-on provider to understand what personal data from your server or data center environment they may access, transfer or otherwise process and how they will support your GDPR compliance efforts.
If you are a server or data center customer, Atlassian does not access, store, or otherwise process the personal data you choose to store within the products. For information about personal data Atlassian processes, see our Privacy Policy.