Assess your instance scale using Confluence Cloud Migration Assistant: Database queries and usage metrics

You can assess the scale of your Confluence instance by collecting data about its content, such as the number of users, spaces, pages, and many more. Sharing this data with the Atlassian migration team will allow us to analyze the complexity of your migration and provide some guidance on migration planning.

How it works

The assessments are part of the cloud migration assistants, currently hidden behind a dark feature. When you enable them and run the assessment, we’ll automatically collect the following data:

  • Database entities: Number, or metadata, of entities, such as spaces or pages

  • Macros: Confluence macros, including nested macros, and macros whose cloud equivalents work differently or don't exist at all

  • Usage data: for example, the number of active users in the past 14 days

  • Browser metrics: performance and browser metrics, including network speed and quality, based on users' browser

  • Traffic distribution: distribution of traffic (%) on different actions done by your users, for example viewing pages or adding comments

The data will be saved in a ZIP archive that contains several files. You’ll need to share them with Atlassian for analysis, and you can also review them yourself to get a better idea of your instance.

How we use the data

As part of the assessment process, we'll only use the metadata about your Confluence instance that you share with us. This metadata contains no Personally Identifiable Information (PII).

This will help us better understand your data complexity and cloud performance needs, and will allow us to craft a migration strategy and plan that mitigates risk and sets you up for success. Additionally, the metadata collected will help us continually improve our products and tooling.

Running the assessment requires you to enable a feature flag in your Confluence instance. The assessment tool will monitor your instance for 24 hours, after which time it will output several results files. The assessment will not consume system resources during this time.

FAQs

Here are some common questions we’re getting.

Do you collect any identifiable data?

The assessment doesn’t collect any personally identifiable information (PII). Any data that we collect is based on IDs. It’s also not automatically shared with Atlassian – you share the files when you want to.

Do I need an internet connection? Do you call any APIs outside of my network?

An internet connection is not required to collect the data. We also don't call any APIs outside of your instance’s network.

Do I need to run assessments on the production instance or is the testing one enough?

We recommend that you run the assessment on your production instance, because it improves the accuracy of our recommendations. We understand that’s not always possible, so here’s a more detailed explanation to help you decide.

Data shape

When it comes to data shape (for example, the number of spaces, users, or entities exceeding the guardrails), the testing instance should be sufficient. Even if it's behind the production instance by a month, the data shape will be relatively similar, so we'll be able to assess it accurately. The production instance is still better, but the testing one should be enough.

Performance: Traffic, usage, and instance metadata

Apart from the data itself, we also assess the traffic and usage of your instance, based on the data from your users' browsers. We also collect instance metadata, for example hardware details or network speed. All of these metrics can affect performance, and we use them to determine whether your target cloud site is sufficient for your needs. If it’s not, we might recommend splitting your instance into multiple sites in cloud.

Assessing your testing instance won’t give accurate (or any) results on traffic, usage, and instance metadata, which are important factors.

Do I need to run the assessment on each Data Center node separately?

No, the assessment collects the data from the entire Data Center instance, and you don't need to run it separately on each node.

You also can't choose a specific node to run it on. 

What’s the performance impact on the Confluence instance?

We did performance testing and optimizations to make sure the assessment doesn’t affect your instance. When it’s running, your users can keep doing their work in Confluence.

What’s the user key / ID collected from the user’s browser?

When collecting data from your users' browsers, we collect the user key associated with every user. These keys are hashed on export and never exposed to Atlassian teams. We only use them for grouping, clustering, and creating themes for research and analysis.
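The article doesn't state which hashing method the assistant uses. As an illustration only, the sketch below shows how a salted hash can hide a user key while still letting records from the same user be grouped together. The function name `make_hasher`, the choice of SHA-256, and the sample keys are all assumptions made for this example.

```python
import hashlib
import secrets


def make_hasher(salt=None):
    """Return a function that maps a user key to a salted SHA-256 digest."""
    # A random salt per export means digests can't be matched across exports.
    salt = salt if salt is not None else secrets.token_bytes(16)

    def hash_key(user_key: str) -> str:
        return hashlib.sha256(salt + user_key.encode("utf-8")).hexdigest()

    return hash_key


hash_key = make_hasher()

# Hypothetical user keys, for demonstration only.
a1 = hash_key("user-key-123")
a2 = hash_key("user-key-123")
b = hash_key("user-key-456")

# The same key always maps to the same digest within one export,
# so records can be grouped without exposing the original key.
print(a1 == a2, a1 == b)
```

Because the digest is stable within a single export, grouping and clustering still work, but the original user key can't be read back from the output.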

Before you begin

We're adding new metrics on a regular basis. To get the most accurate assessment results, update the Confluence Cloud Migration Assistant to the latest version. Some results might not be available at all in earlier versions.

Assess your instance scale

To assess your instance, follow these steps.

1. Enable the assessment

To enable the assessment:

  1. Go to <Confluence-url>/admin/darkfeatures.action

  2. Add the following flag: 

    migration-assistant.enable.assess-l1-cloud-tooling.feature

For details, see Enabling dark features in Confluence.
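If you manage several instances, a small helper can build the dark features URL for each one. This is a minimal sketch: the path and flag name come from the steps above, while the function name `dark_features_url` and the example base URL are placeholders.

```python
from urllib.parse import urljoin

# Flag name and admin path are copied from the steps above.
ASSESSMENT_FLAG = "migration-assistant.enable.assess-l1-cloud-tooling.feature"
DARK_FEATURES_PATH = "admin/darkfeatures.action"


def dark_features_url(base_url: str) -> str:
    """Return the dark features admin page for a Confluence base URL."""
    # A trailing slash keeps urljoin from dropping a context path such as /wiki.
    return urljoin(base_url.rstrip("/") + "/", DARK_FEATURES_PATH)


# https://confluence.example.com/wiki is a placeholder base URL.
print(dark_features_url("https://confluence.example.com/wiki"))
print(ASSESSMENT_FLAG)
```

The helper only builds the URL. You still add the flag manually on that page as a Confluence administrator.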

2. Run the assessment

To run the assessment and collect the data:

  1. Open the Confluence Cloud Migration Assistant.

  2. In the Assess your instance data card, select Analyze data.

  3. The assessment will start. We’ll collect the data that includes:

    • Database entities: Running queries on your database to collect data about your entities.

    • Usage data: for example, the number of active users in the past 14 days

    • Browser metrics: We’ll collect performance data from your users' browsers.

    • Traffic distribution: distribution of traffic (%) on different actions done by your users, for example viewing pages or adding comments

3. Download the results

Once the assessment is complete, select Download ZIP file. The ZIP archive includes the following files.

The files might be different if you're using one of the earlier versions of the assistant. We recommend that you update it to the latest version.

Files and their descriptions:

  • confluence-entities-[date].csv: Data about specific entities, retrieved from the database. It helps us understand the scale of your instance and determine the best migration strategy.

  • confluence-macros-[date].jsonl: Data about Confluence macros, including information on:

    • Macros with other macros nested inside
    • Macros whose equivalents exist in cloud, but have some differences in functionality
    • Macros that don't exist in cloud, and are therefore incompatible

  • confluence-browser-metrics-[date].jsonl: Data from users' browsers on the performance of your instance. It helps us understand what you’ll need in cloud for best performance.

  • confluence-usage-metrics-[date].csv: Usage metrics taken from access logs. It helps the Confluence Cloud teams determine the best cloud instances for you.

  • confluence-traffic-distribution-[date].csv: Distribution of traffic (%) on different actions done by your users, for example viewing pages or adding comments. It can help you find some unusual traffic.
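Before sharing the archive, you may want to confirm that it contains the files listed above. The sketch below matches archive members against the documented file name prefixes. It assumes nothing about the format of the [date] part, and the function names `check_archive` and `check_zip` are made up for this example.

```python
import zipfile

# File name prefixes taken from the list above; the [date] part varies.
EXPECTED_PREFIXES = [
    "confluence-entities-",
    "confluence-macros-",
    "confluence-browser-metrics-",
    "confluence-usage-metrics-",
    "confluence-traffic-distribution-",
]


def check_archive(names: list[str]) -> dict[str, list[str]]:
    """Map each expected prefix to the archive members that start with it."""
    found = {prefix: [] for prefix in EXPECTED_PREFIXES}
    for name in names:
        base = name.rsplit("/", 1)[-1]  # ignore any folder inside the archive
        for prefix in EXPECTED_PREFIXES:
            if base.startswith(prefix):
                found[prefix].append(name)
    return found


def check_zip(path: str) -> dict[str, list[str]]:
    """Open the downloaded ZIP file and check its member names."""
    with zipfile.ZipFile(path) as archive:
        return check_archive(archive.namelist())
```

Calling `check_zip` with the path to your downloaded archive returns a dictionary. An empty list for a prefix suggests that file is missing, which can happen with earlier versions of the assistant.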

4. Share the results with Atlassian

Share the results with Atlassian. You'll most likely do this by attaching them to the MOVE ticket or Support request that pointed you to this page.

We will review the output results to better understand the complexity, data shape, performance needs, and risks, if any. These will help inform the creation of your Migration Strategy and Plan.

What data is collected?

Here are some details on the collected data that you can find in each file.


File: confluence-entities-[date].csv

The file includes data about the following entities:

  • Total number of users

  • Number of users with a unique username

  • Number of active users

  • Number of inactive users

  • Number of user groups

  • Total number of spaces by status

  • Number of attachments per each of top 100 spaces with most attachments

  • Total attachments size in GB

  • Metadata of groups with 35k or more user memberships

  • Number of active user groups

  • Number of attachments in each of the top 100 pages with most attachments

  • Number of attachments per space

  • Maximum number of comments in a page

  • Number of pages per each of top 100 spaces with most pages

  • Maximum number of pages in a space

  • Number of pages with restrictions

  • Maximum number of restrictions in a page

  • Maximum number of likes in a page

  • Maximum number of space permissions in a page

  • Maximum number of space group permissions in a space

  • Number of pages per version

  • Number of pages per status

  • Number of Jira Issue Macros in each of top 10k pages with most Jira Issue Macros

  • List of installed apps

  • Number of media in each of the top 50 pages with most media

  • Size of the biggest 10 non-personal spaces in GB

  • Metadata of spaces modified in last 180 days

  • Number of groups per each of top 100 users with most groups

  • Number of embedded attachments per each of the top 100 pages with most embedded attachments

  • Number of current attachments per each of the top 100 pages with most current attachments

  • Maximum number of space user permissions

  • Database tables size in GB

  • Database size in GB

  • Media size in GB

  • Number of tables per each of the top 50 pages with most tables

  • Number of pages per page status

  • Tables size in GB per each of the top 50 pages with most tables

  • Metadata of personal spaces of inactive users

  • Maximum number of child pages in a page

  • Maximum depth of child pages in a page

  • Number of table cells per each of the top 50 pages with most table cells

  • Metadata of the top 10 non-personal spaces with the largest data size

  • Metadata of the personal space with the largest data size


File: confluence-macros-[date].jsonl

The file includes information about Confluence macros that won't be fully compatible after the migration. The macros are divided into spaces where they appear, and include the following types:

  • differentMacros: Macros whose equivalents exist in cloud, but have some differences in functionality
  • notAvailableMacros: Macros that don't exist in cloud and won't work after the migration
  • nestedMacros: Macros that have other macros nested inside

For each of the listed macros, you'll be able to view the name, details, count, and page URLs where these macros appear.
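To get a quick overview of the macros file yourself, you could tally how many records mention each macro type. The exact record layout isn't documented here, so this sketch assumes each line is a JSON object in which the three type names above appear as keys. The `space` and `name` fields in the sample lines are hypothetical.

```python
import json
from collections import Counter

# Type names taken from the list above.
MACRO_TYPES = ("differentMacros", "notAvailableMacros", "nestedMacros")


def count_macro_types(lines):
    """Count records that contain a non-empty entry for each macro type."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # JSONL: one JSON document per line
        for macro_type in MACRO_TYPES:
            if record.get(macro_type):
                totals[macro_type] += 1
    return totals


# Hypothetical sample lines; the real file layout may differ.
sample = [
    '{"space": "DOCS", "nestedMacros": [{"name": "expand"}]}',
    '{"space": "ENG", "notAvailableMacros": [{"name": "x"}], "nestedMacros": []}',
]
print(count_macro_types(sample))
```

To run it on a real file, pass an open file object, which yields one line at a time, and adjust the key lookup to match the structure you see in your own export.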


File: confluence-browser-metrics-[date].jsonl

  • user ID: A unique identifier of a user. It’s generated securely and hashed randomly to maintain privacy, while also allowing us to track user interactions.

  • Browser type and version: Details about the browser, for example Google Chrome, Safari.

  • Operating system: Details about the operating system, for example Windows, MacOS.

  • Processor count: Number of processors on the device.

  • System memory (RAM): Total memory or RAM on the device.

  • Network download speed: The speed of downloading data.

  • Network connection quality: The effectiveness of network connection.

  • Network latency (RTT): Round Trip Time (RTT) is a measurement of the time it takes for a signal to travel from a user's computer to the Confluence instance and back. This helps gauge the responsiveness of users' network connection.


File: confluence-usage-metrics-[date].csv

  • Interactions date: Date when an interaction with the Confluence instance was recorded in the access logs.

  • Active users per day: Total number of unique users who interacted with the Confluence instance on each day of the past 14 days.

  • Peak-hour active users per day: Number of unique users who interacted with the Confluence instance at the same time. We obtain it by aggregating user IDs and the corresponding date-hour combinations in the access logs. It provides a snapshot of your instance’s busiest periods.

  • Node availability and data collection status: Data on the availability of each node (or single node). It also shows the status of data collection.
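The aggregation described above, grouping user IDs by date and by date-hour combination, can be sketched as follows. This is an illustration of the idea, not the assistant's actual implementation. The input is a simplified stand-in for access log rows, and the function name `usage_metrics` is made up for this example.

```python
from collections import defaultdict
from datetime import datetime


def usage_metrics(events):
    """Compute daily and peak-hour unique users.

    events: iterable of (user_id, datetime) pairs, a stand-in for access log rows.
    Returns {date: (active_users, peak_hour_active_users)}.
    """
    users_by_day = defaultdict(set)
    users_by_day_hour = defaultdict(set)
    for user_id, ts in events:
        users_by_day[ts.date()].add(user_id)
        users_by_day_hour[(ts.date(), ts.hour)].add(user_id)

    # The peak for a day is its busiest hour, measured in unique users.
    peak_by_day = defaultdict(int)
    for (day, _hour), users in users_by_day_hour.items():
        peak_by_day[day] = max(peak_by_day[day], len(users))

    return {day: (len(users), peak_by_day[day]) for day, users in users_by_day.items()}


# Hypothetical events: four users on one day, three of them in the same hour.
events = [
    ("u1", datetime(2024, 11, 29, 9, 5)),
    ("u2", datetime(2024, 11, 29, 9, 40)),
    ("u1", datetime(2024, 11, 29, 14, 0)),
    ("u3", datetime(2024, 11, 29, 14, 30)),
    ("u4", datetime(2024, 11, 29, 14, 45)),
]
print(usage_metrics(events))
```

Using sets means a user who appears many times in the same hour or day is counted once, which is what "unique users" requires.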


File: confluence-traffic-distribution-[date].csv

The file includes the % distribution of traffic on different actions performed by users. Here's the list of actions:

  • Viewing a page
  • Editing a page
  • Adding a comment
  • Adding and resolving an inline comment
  • Using quick search
  • Using advanced search
  • Liking a page or a comment
  • Publishing a page
  • Creating a draft
  • Viewing home page
  • Viewing page history
  • Viewing a blog
  • Adding labels
  • Toggling (enabling or disabling) space permissions for a group
  • Using CQL (Confluence Query Language) to search by username
  • Using CQL to search by random page ID
  • Using CQL to search by page title
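
A percentage distribution like this one is each action's count divided by the total number of recorded actions. The sketch below shows that arithmetic on a made-up action log. The function name `traffic_distribution` and the sample counts are assumptions, and the real file may be computed differently.

```python
from collections import Counter


def traffic_distribution(actions):
    """Return each action's share of total traffic as a percentage."""
    counts = Counter(actions)
    total = sum(counts.values())
    if total == 0:
        return {}  # no traffic recorded
    return {action: round(100 * n / total, 1) for action, n in counts.most_common()}


# Hypothetical action log; labels mirror a few entries from the list above.
actions = ["Viewing a page"] * 7 + ["Editing a page"] * 2 + ["Adding a comment"]
print(traffic_distribution(actions))
```

An action with an unexpectedly large share, for example a CQL search that dominates the distribution, is the kind of unusual traffic this file can help you spot.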
Last modified on Nov 29, 2024
