Assess your instance scale using Confluence Cloud Migration Assistant: Database queries and usage metrics
You can assess the scale of your Confluence instance by collecting data about its content, such as the number of users, spaces, and pages. Sharing this data with the Atlassian migration team allows us to analyze the complexity of your migration and provide guidance on migration planning.
How it works
The assessments are part of the cloud migration assistants, currently hidden behind a dark feature. When you enable them and run the assessment, we’ll automatically collect the following data:
- Database entities: number, or metadata, of entities, such as spaces or pages
- Macros: Confluence macros, including nested macros, and macros whose cloud equivalents work differently or don't exist at all
- Usage data: for example, the number of active users in the past 14 days
- Browser metrics: performance and browser metrics, including network speed and quality, based on users' browsers
- Traffic distribution: distribution of traffic (%) on different actions done by your users, for example viewing pages or adding comments
The data will be saved in a ZIP archive that contains several files. You’ll need to share them with Atlassian for analysis, and you can also review them yourself to get a better idea of your instance.
How we use the data
As part of the assessment process, we'll only use the metadata about your Confluence instance that you share with us. This metadata contains no Personally Identifiable Information (PII).
This will help us better understand your data complexity and cloud performance needs, and will allow us to craft a migration strategy and plan that mitigates risk and sets you up for success. Additionally, the metadata collected will help us continually improve our products and tooling.
Running the assessment requires you to enable a feature flag in your Confluence instance. The assessment tool will monitor your instance for 24 hours, after which time it will output several results files. The assessment will not consume system resources during this time.
FAQs
Here are some common questions we’re getting.
Do you collect any identifiable data?
The assessment doesn’t collect any personally identifiable information (PII). Any data that we collect is based on IDs. It’s also not automatically shared with Atlassian; you share the files when you want to.
Do I need an Internet connection? Do you call any APIs outside of my network?
An Internet connection is not required to collect the data. We’re also not calling any APIs outside of your instance’s network.
Do I need to run assessments on the production instance or is the testing one enough?
We recommend that you run the assessment on your production instance, because it improves the accuracy of our recommendations. We understand that’s not always possible.
Do I need to run the assessment on each Data Center node separately?
No, the assessment collects the data from the entire Data Center instance, and you don't need to run it separately on each node.
You also can't choose a specific node to run it on.
What’s the performance impact on the Confluence instance?
We did performance testing and optimizations to make sure the assessment doesn’t affect your instance. When it’s running, your users can keep doing their work in Confluence.
What’s the user key / ID collected from the user’s browser?
When collecting data from your users' browsers, we collect the user key associated with every user. These keys are hashed on export and never exposed to Atlassian teams. We only use them for grouping, clustering, and creating themes for research and analysis.
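To illustrate how hashed keys can still support grouping without exposing the original values, here is a minimal sketch. It assumes a salted SHA-256 digest, which is only an illustration; the assistant's actual hashing scheme isn't documented on this page, and `hash_user_key` and `export_salt` are hypothetical names.

```python
import hashlib
import secrets


def hash_user_key(user_key: str, salt: bytes) -> str:
    """Return a hex digest that stands in for the raw user key."""
    return hashlib.sha256(salt + user_key.encode("utf-8")).hexdigest()


# Hypothetical: a random salt generated once per export.
# This is an assumption, not the assistant's actual scheme.
export_salt = secrets.token_bytes(16)

hashed_a = hash_user_key("user-key-1", export_salt)
hashed_b = hash_user_key("user-key-1", export_salt)
hashed_c = hash_user_key("user-key-2", export_salt)

# The same key always maps to the same digest within one export,
# so records can be grouped per user without revealing the key.
print(hashed_a == hashed_b)
print(hashed_a == hashed_c)
```

Because the digest is stable within a single export, records from the same user can be clustered, while different users stay distinguishable.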
Before you begin
We're adding new metrics on a regular basis. To get the most accurate assessment results, update the Confluence Cloud Migration Assistant to the latest version. Some results might not be available at all in earlier versions.
Assess your instance scale
To assess your instance, follow these steps.
1. Enable the assessment
To enable the assessment:
Go to <Confluence-url>/admin/darkfeatures.action
Add the following flag: migration-assistant.enable.assess-l1-cloud-tooling.feature
For details, see Enabling dark features in Confluence.
2. Run the assessment
To run the assessment and collect the data:
Open the Confluence Cloud Migration Assistant.
In the Assess your instance data card, select Analyze data.
The assessment will start. We’ll collect data that includes:
- Database entities: we’ll run queries on your database to collect data about your entities.
- Usage data: for example, the number of active users in the past 14 days.
- Browser metrics: we’ll collect performance data from your users' browsers.
- Traffic distribution: distribution of traffic (%) on different actions done by your users, for example viewing pages or adding comments.
3. Download the results
Once the assessment is complete, select Download ZIP file. The ZIP archive includes the following files.
The files might be different if you're using one of the earlier versions of the assistant. We recommend that you update it to the latest version.
File | Description |
---|---|
confluence-entities-[date].csv | Data about specific entities, retrieved from the database. It helps us understand the scale of your instance and determine the best migration strategy. |
confluence-macros-[date].jsonl | Data about Confluence macros, including information on nested macros and on macros whose cloud equivalents work differently or don't exist at all. |
confluence-browser-metrics-[date].jsonl | Data from users' browsers on the performance of your instance. It helps us understand what you’ll need in cloud for best performance. |
confluence-usage-metrics-[date].csv | Usage metrics taken from access logs. It helps the Confluence Cloud teams determine the best cloud instances for you. |
confluence-traffic-distribution-[date].csv | Distribution of traffic (%) on different actions done by your users, for example viewing pages or adding comments. It can help you find some unusual traffic. |
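Before sharing the archive, you may want to confirm that it contains all of the files listed above. The following is a minimal sketch that uses only the Python standard library. The file-name prefixes come from the table, while the `missing_result_files` helper, the sample archive, and the date format in the sample names are assumptions made for illustration.

```python
import io
import zipfile
from typing import List

# File-name prefixes taken from the table above; the [date] part varies.
EXPECTED_PREFIXES = [
    "confluence-entities-",
    "confluence-macros-",
    "confluence-browser-metrics-",
    "confluence-usage-metrics-",
    "confluence-traffic-distribution-",
]


def missing_result_files(zip_source) -> List[str]:
    """Return the expected prefixes that have no matching file in the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        # Ignore any folder component inside the archive.
        names = [name.rsplit("/", 1)[-1] for name in archive.namelist()]
    return [
        prefix
        for prefix in EXPECTED_PREFIXES
        if not any(name.startswith(prefix) for name in names)
    ]


# Build a small in-memory archive for demonstration (hypothetical file names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("confluence-entities-2024-05-01.csv", "")
    archive.writestr("confluence-browser-metrics-2024-05-01.jsonl", "")
    archive.writestr("confluence-usage-metrics-2024-05-01.csv", "")
    archive.writestr("confluence-traffic-distribution-2024-05-01.csv", "")
buffer.seek(0)

missing = missing_result_files(buffer)
print(missing)
```

With a real export, you would pass the path of the downloaded ZIP file instead of the in-memory buffer. A missing file may simply mean that you're using an earlier version of the assistant.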
4. Share the results with Atlassian
Share the results with Atlassian. You'll most likely do it by attaching them to the MOVE ticket or Support request that pointed you to this page.
We will review the output results to better understand the complexity, data shape, performance needs, and risks, if any. These will help inform the creation of your Migration Strategy and Plan.
What data is collected?
Here are some details on the collected data that you can find in each file.
File: confluence-entities-[date].csv
The file includes data about the following entities:
Total number of users
Number of users with a unique username
Number of active users
Number of inactive users
Number of user groups
Total number of spaces by status
Number of attachments per each of top 100 spaces with most attachments
Total attachments size in GB
Metadata of groups with 35k or more user memberships
Number of active user groups
Number of attachments in each of the top 100 pages with most attachments
Number of attachments per space
Maximum number of comments in a page
Number of pages per each of top 100 spaces with most pages
Maximum number of pages in a space
Number of pages with restrictions
Maximum number of restrictions in a page
Maximum number of likes in a page
Maximum number of space permissions in a page
Maximum number of space group permissions in a space
Number of pages per version
Number of pages per status
Number of Jira Issue Macros in each of top 10k pages with most Jira Issue Macros
List of installed apps
Number of media in each of the top 50 pages with most media
Size of the biggest 10 non-personal spaces in GB
Metadata of spaces modified in last 180 days
Number of groups per each of top 100 users with most groups
Number of embedded attachments per each of the top 100 pages with most embedded attachments
Number of current attachments per each of the top 100 pages with most current attachments
Maximum number of space user permissions
Database tables size in GB
Database size in GB
Media size in GB
Number of tables per each of the top 50 pages with most tables
Number of pages per page status
Tables size in GB per each of the top 50 pages with most tables
Metadata of personal spaces of inactive users
Maximum number of child pages in a page
Maximum depth of child pages in a page
Number of table cells per each of the top 50 pages with most table cells
Metadata of the top 10 non-personal spaces with the largest data size
Metadata of the personal space with the largest data size
File: confluence-macros-[date].jsonl
The file includes information about Confluence macros that won't be fully compatible after the migration. The macros are divided into spaces where they appear, and include the following types:
- differentMacros: Macros whose equivalents exist in cloud, but have some differences in functionality
- notAvailableMacros: Macros that don't exist in cloud and won't work after the migration
- nestedMacros: Macros that have other macros nested inside
For each of the listed macros, you'll be able to view the name, details, count, and page URLs where these macros appear.
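If you want to review the macros file yourself, a short script can total the macro counts per type. The sketch below assumes a hypothetical record layout: one JSON object per line, with a `space` key and one list per macro type whose items carry `name` and `count`. The real field names may differ, so check a few lines of your own file first.

```python
import json
from collections import Counter

# The three macro types described above.
MACRO_TYPES = ("differentMacros", "notAvailableMacros", "nestedMacros")


def count_macros_by_type(jsonl_text: str) -> Counter:
    """Sum macro counts per type across all lines (assumed record layout)."""
    totals = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for macro_type in MACRO_TYPES:
            for macro in record.get(macro_type, []):
                totals[macro_type] += macro.get("count", 0)
    return totals


# Hypothetical sample data; the keys are assumptions, not the documented schema.
sample = "\n".join([
    json.dumps({
        "space": "ENG",
        "differentMacros": [{"name": "macro-a", "count": 3}],
        "nestedMacros": [{"name": "macro-b", "count": 2}],
    }),
    json.dumps({
        "space": "OPS",
        "differentMacros": [{"name": "macro-a", "count": 1}],
        "notAvailableMacros": [{"name": "macro-c", "count": 5}],
    }),
])

totals = count_macros_by_type(sample)
for macro_type in MACRO_TYPES:
    print(macro_type, totals[macro_type])
```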
File: confluence-browser-metrics-[date].jsonl
User ID: A unique identifier of a user. It’s generated securely and hashed to maintain privacy, while still allowing us to group interactions by user.
Browser type and version: Details about the browser, for example Google Chrome, Safari.
Operating system: Details about the operating system, for example Windows, macOS.
Processor count: Number of processors on the device.
System memory (RAM): Total memory or RAM on the device.
Network download speed: The speed of downloading data.
Network connection quality: The effectiveness of network connection.
Network latency (RTT): Round Trip Time (RTT) is a measurement of the time it takes for a signal to travel from a user's computer to the Confluence instance and back. This helps gauge the responsiveness of users' network connection.
File: confluence-usage-metrics-[date].csv
Interactions date: Date when an interaction with the Confluence instance was recorded in the access logs.
Active users per day: Number of unique users who interacted with the Confluence instance on each day of the past 14 days.
Peak-hour active users per day: Number of unique users who interacted with the Confluence instance at the same time. We obtain it by aggregating user IDs and the corresponding date-hour combinations in the access logs. It provides a snapshot of your instance’s busiest periods.
Node availability and data collection status: Data on the availability of each node (or single node). It also shows the status of data collection.
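The aggregation behind the active-user and peak-hour figures can be sketched as follows. This is an illustration only: it assumes access-log entries have already been reduced to pairs of user ID and ISO timestamp, and the `daily_usage` function and sample events are hypothetical, not the assistant's actual implementation.

```python
from collections import defaultdict
from datetime import datetime


def daily_usage(events):
    """Compute unique users per day and the busiest hour's unique users.

    events: iterable of (user_id, ISO 8601 timestamp) pairs.
    """
    users_per_day = defaultdict(set)
    users_per_hour = defaultdict(set)
    for user_id, timestamp in events:
        moment = datetime.fromisoformat(timestamp)
        day = moment.date().isoformat()
        users_per_day[day].add(user_id)
        # Group user IDs by date-hour combination.
        users_per_hour[(day, moment.hour)].add(user_id)

    result = {}
    for day, users in users_per_day.items():
        peak = max(
            len(hour_users)
            for (hour_day, _), hour_users in users_per_hour.items()
            if hour_day == day
        )
        result[day] = {
            "active_users": len(users),
            "peak_hour_active_users": peak,
        }
    return result


# Hypothetical sample events.
events = [
    ("u1", "2024-05-01T09:05:00"),
    ("u2", "2024-05-01T09:40:00"),
    ("u1", "2024-05-01T14:10:00"),
    ("u3", "2024-05-01T14:30:00"),
    ("u4", "2024-05-01T14:45:00"),
    ("u1", "2024-05-02T10:00:00"),
]

usage = daily_usage(events)
for day in sorted(usage):
    print(day, usage[day])
```

In the sample, four distinct users are active on the first day, and the busiest hour (14:00) has three of them.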
File: confluence-traffic-distribution-[date].csv
The file includes the % distribution of traffic on different actions performed by users. Here's the list of actions:
- Viewing a page
- Editing a page
- Adding a comment
- Adding and resolving an inline comment
- Using quick search
- Using advanced search
- Liking a page or a comment
- Publishing a page
- Creating a draft
- Viewing home page
- Viewing page history
- Viewing a blog
- Adding labels
- Toggling (enabling or disabling) space permissions for a group
- Using CQL (Confluence Query Language) to search by username
- Using CQL to search by random page ID
- Using CQL to search by page title
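As a rough illustration of how a percentage distribution like this is derived, the sketch below converts raw action counts into percentages of total traffic. The counts and the `traffic_distribution` helper are hypothetical; the file itself already contains the computed percentages.

```python
def traffic_distribution(action_counts):
    """Convert raw action counts into percentages of total traffic."""
    total = sum(action_counts.values())
    if total == 0:
        # No traffic recorded: report 0% for every action.
        return {action: 0.0 for action in action_counts}
    return {
        action: round(100 * count / total, 1)
        for action, count in action_counts.items()
    }


# Hypothetical counts for three of the actions listed above.
counts = {
    "Viewing a page": 700,
    "Editing a page": 200,
    "Adding a comment": 100,
}

distribution = traffic_distribution(counts)
for action, share in distribution.items():
    print(f"{action}: {share}%")
```

An action with an unexpectedly high share, for example a CQL search, may point to unusual traffic worth investigating before the migration.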