Migration from Wiki Markup to XHTML-Based Storage Format
If you are upgrading to Confluence 4.0 or later from an older version (Confluence 3.5.x or earlier), an automatic migration of your content will take place as part of the upgrade. This is a non-destructive process: your existing content is not overwritten. Instead, the migration process creates a new version of each wiki markup page. The new version uses the new XHTML-based storage format, so that you can edit the page in the Confluence rich text editor.
In addition, if you are upgrading to Confluence 4.3 or later from an older version, an automatic migration of your page templates will take place as part of the upgrade. See Migration of Templates from Wiki Markup to XHTML-Based Storage Format.
Note: Even though the process is non-destructive, be sure to back up your database and home directory before starting the new version of Confluence, as we recommend for any Confluence upgrade.
Migration process
Depending on the size of your Confluence installation, the migration from wiki markup to the new XHTML-based storage format could prove time-consuming. The duration of the migration is difficult to estimate because it depends on a number of site-specific factors. As a rough guide, a test dataset of 130,000 pages, totalling approximately 700 MB, took six minutes to migrate.
The following properties can be modified to allow finer control over the migration process:
Property | Purpose | Default |
---|---|---|
confluence.wiki.migration.threads | The number of concurrent worker threads migrating content | 4 |
confluence.wiki.migration.batch.size | The number of items migrated in each batch of work | 500 |
confluence.wiki.migration.versioncomment | The comment associated with the newly migrated version of each piece of content | "Migrated to Confluence 4.0" |
(For instructions on setting Confluence system properties see this document.)
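As an illustration only, the sketch below renders overrides for the properties in the table as JVM `-D` arguments, which is the general form in which Java system properties are passed on the command line. The helper name `jvm_flags` is hypothetical and not part of Confluence; where the resulting string goes (startup script, service configuration) depends on your installation, so follow the linked instructions for that step.

```python
import shlex

# Property names and defaults are copied from the table above.
MIGRATION_DEFAULTS = {
    "confluence.wiki.migration.threads": "4",
    "confluence.wiki.migration.batch.size": "500",
    "confluence.wiki.migration.versioncomment": "Migrated to Confluence 4.0",
}


def jvm_flags(overrides):
    """Render migration property overrides as JVM -D arguments.

    Hypothetical helper for illustration; it rejects property names that
    do not appear in the table so that typos are caught early.
    """
    unknown = set(overrides) - set(MIGRATION_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown migration properties: {sorted(unknown)}")
    # shlex.quote leaves simple values untouched and quotes values that
    # contain spaces (POSIX shell quoting; Windows quoting differs).
    return " ".join(shlex.quote(f"-D{name}={value}") for name, value in overrides.items())


if __name__ == "__main__":
    print(jvm_flags({
        "confluence.wiki.migration.threads": 8,
        "confluence.wiki.migration.batch.size": 250,
    }))
```

Values containing spaces, such as a custom version comment, come out shell-quoted so they survive as a single argument.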
Again, because Confluence installations vary so widely, it is hard to give specific recommendations for the above settings. Note, though, that increasing the batch size, the number of threads, or both will increase the peak memory required for migration. If memory is an issue, consider decreasing one of these settings as you increase the other.
Another factor to be aware of when modifying these defaults is the cache configuration of your site. The migration will quickly populate certain Confluence caches, so if you have customised caches as described here, be sure that there is enough memory on the server for these caches should they reach maximum capacity.
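One rough way to reason about this trade-off is to treat the number of items that can be in flight at once (threads multiplied by batch size) as a proxy for peak memory. This proxy is an assumption made for illustration, not a documented Confluence formula; actual memory use also depends on page sizes and cache settings. Under that assumption, the hypothetical helper below suggests a batch size that keeps the in-flight count at or below the default level when the thread count changes.

```python
DEFAULT_THREADS = 4       # default from the table above
DEFAULT_BATCH_SIZE = 500  # default from the table above


def items_in_flight(threads, batch_size):
    """Upper bound on items being migrated at once (assumed memory proxy)."""
    return threads * batch_size


def batch_size_for(threads, budget=DEFAULT_THREADS * DEFAULT_BATCH_SIZE):
    """Largest batch size that keeps threads * batch_size within `budget`.

    Hypothetical heuristic: the budget defaults to the in-flight count of
    the default settings (4 threads * 500 items).
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return max(1, budget // threads)


if __name__ == "__main__":
    for threads in (2, 4, 8):
        print(threads, batch_size_for(threads))
```

For example, doubling the threads from 4 to 8 would suggest halving the batch size from 500 to 250 under this heuristic.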
Watching the migration logs during the upgrade
To monitor the progress of a site migration, watch the output in the application log.
Typical progress is shown by multiple log entries at the INFO level in the following format:
WikiToXhtmlMigrationThread-n - Migrated 2500 of 158432 pages, this batch migrated 500/500 without error
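If you want to track progress programmatically, a progress line of this shape can be parsed with a regular expression. The sketch below is based only on the sample line above; the function name `parse_progress` is hypothetical, and reading `500/500` as "items migrated without error / batch size" is an assumption.

```python
import re

# Pattern inferred from the single sample line shown above.
PROGRESS_RE = re.compile(
    r"Migrated (?P<done>\d+) of (?P<total>\d+) pages, "
    r"this batch migrated (?P<ok>\d+)/(?P<size>\d+) without error"
)


def parse_progress(line):
    """Return progress figures from a migration log line, or None if no match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done, total, ok, size = (int(match[name]) for name in ("done", "total", "ok", "size"))
    return {
        "done": done,
        "total": total,
        "percent": 100 * done / total if total else 0.0,
        # Assumption: the batch figure is "migrated cleanly / batch size".
        "batch_errors": size - ok,
    }


if __name__ == "__main__":
    sample = ("WikiToXhtmlMigrationThread-n - Migrated 2500 of 158432 pages, "
              "this batch migrated 500/500 without error")
    print(parse_progress(sample))
```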
A wide array of messages may be logged for individual pages, but any errors are also collected for display in a single migration report once all content has been processed. Here is a typical example of such a report:
Wiki to XHTML Exception Report:
Summary:
0 settings values failed.
0 PageTemplates failed.
2 ContentEntityObjects failed.
Content Exceptions:
1) Type: page, Id: 332, Title: Release Notes 1.0b3, Space: DOC - Confluence 4.0 Beta. Cause: com.atlassian.confluence.content.render.xhtml.migration.exceptions.UnknownMacroMigrationException: The macro link is unknown.. Message: The macro link is unknown.
2) Type: comment, Id: 6919, Title: null, Global Scope. Cause: com.atlassian.confluence.content.render.xhtml.migration.exceptions.UnknownMacroMigrationException: The macro mymacro is unknown.. Message: The macro mymacro is unknown.
Each entry in the report identifies the content that caused migration exceptions and displays the exceptions themselves.
In almost all cases, content reported as errored will have been migrated to the new XHTML-based storage format, but will consist of wiki markup wrapped within an XML 'unmigrated-wiki-markup' macro. This content is still viewable in Confluence and editable within the new Confluence editor.
However, in some cases a batch of content may have completely failed to migrate. This is most typically due to an unhandled exception causing a database transaction rollback. This would be reported in the log with a message like this:
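For a long report, it can help to pull out which content items failed and which macros were unknown. The following sketch parses entries in the shape of the two sample lines above. The pattern is inferred from those samples only, so other entry shapes (or titles that themselves contain ", Space: ") may not match; `parse_entry` is a hypothetical name.

```python
import re

# Entry layout inferred from the two sample report lines above.
ENTRY_RE = re.compile(
    r"^\s*(?P<index>\d+)\)\s+Type: (?P<type>\w+), Id: (?P<id>\d+), "
    r"Title: (?P<title>.*?), (?:Space: (?P<space>.*?)|Global Scope)\. "
    r"Cause: (?P<cause>.*?)\. Message: (?P<message>.*)$"
)
UNKNOWN_MACRO_RE = re.compile(r"The macro (?P<name>\S+) is unknown")


def parse_entry(line):
    """Parse one 'Content Exceptions' entry into a dict, or return None."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["id"] = int(entry["id"])
    # 'space' is None for entries reported as 'Global Scope'.
    macro = UNKNOWN_MACRO_RE.search(entry["message"])
    entry["unknown_macro"] = macro["name"] if macro else None
    return entry


if __name__ == "__main__":
    sample = (
        "2) Type: comment, Id: 6919, Title: null, Global Scope. Cause: "
        "com.atlassian.confluence.content.render.xhtml.migration.exceptions."
        "UnknownMacroMigrationException: The macro mymacro is unknown.. "
        "Message: The macro mymacro is unknown."
    )
    print(parse_entry(sample))
```

Collecting the `unknown_macro` values across all entries gives a quick list of the macros to install or remove before re-attempting the migration.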
Unable to start up Confluence. Fatal error during startup sequence: confluence.lifecycle.core:pluginframeworkdependentupgrades (Run all the upgrades that require the plugin framework to be available) - com.atlassian.confluence.content.render.xhtml.migration.exceptions.MigrationException: java.util.concurrent.ExecutionException: org.springframework.transaction.UnexpectedRollbackException: Transaction rolled back because it has been marked as rollback-only
Confluence provides no further report about this scenario, and will restart as normal without retrying the migration. If a user tries to view any such unmigrated content, they will see an exception similar to this:
java.lang.UnsupportedOperationException: The body of this ContentEntityObject ('Page Title') was 'WIKI' but was expected to be 'XHTML'
The solution is to manually re-run the site migration after the restart.
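Because Confluence does not report this scenario separately, one option is to scan the application log yourself for the two symptoms quoted above. This is a minimal sketch that matches on substrings taken from those sample messages; the function names are hypothetical, and real log lines may be wrapped or worded differently.

```python
# Substrings taken from the sample messages quoted above.
FATAL_MARKERS = ("MigrationException", "UnexpectedRollbackException")
VIEW_MARKER = "was 'WIKI' but was expected to be 'XHTML'"


def classify(line):
    """Label a log line with the migration symptom it shows, if any."""
    if all(marker in line for marker in FATAL_MARKERS):
        return "batch_rolled_back"
    if VIEW_MARKER in line:
        return "unmigrated_content_viewed"
    return None


def needs_rerun(lines):
    """True if any line suggests that some content was left unmigrated."""
    return any(classify(line) is not None for line in lines)


if __name__ == "__main__":
    log = [
        "java.lang.UnsupportedOperationException: The body of this "
        "ContentEntityObject ('Page Title') was 'WIKI' but was expected to be 'XHTML'",
    ]
    print(needs_rerun(log))
```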
Re-running the migration – for content that completely failed the migration
A Confluence Administrator can restart the site migration if there was any content that failed migration (see previous section). Only the content that is still formatted in wiki markup will be migrated, so typically a re-migration will take less time than the original migration.
To manually re-run migration:
- Open this URL in your browser:
<Confluence Address>/admin/force-upgrade.action
- Select wikiToXhtmlMigrationUpgradeTask in the Upgrade task to run dropdown list.
- Choose Force Upgrade.
Re-attempting the migration – for content in 'unmigrated-wiki-markup' macro
The previous section was about dealing with the exceptional circumstance where certain content was left completely unmigrated. The most common migration problem is that the content was migrated but remains formatted as wiki markup on the page, within the body of an 'unmigrated-wiki-markup' macro. Any content which is referenced in the migration report will be found in this state. This content is still viewable and editable but since it is wiki markup it cannot be edited using the full feature set of the rich text editor.
The most common reason for content to be in this state is that the page contains an unknown macro, or a macro that is not compatible with Confluence 4.x.
There are two possible fixes for this situation:
- Install a version of the macro that is compatible with Confluence 4.x. See Plugin Development Upgrade FAQ for 4.0.
- Edit the page and remove the problematic macro.
Regardless of the solution you choose, you can then force a re-migration of all the content (including content in templates) that was left wrapped in an 'unmigrated-wiki-markup' macro. This feature is found at
<Confluence Address>/admin/unmigratedcontent.action
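Both admin URLs mentioned on this page are formed by appending a fixed path to your Confluence address. The sketch below shows that composition; the base address `https://confluence.example.com` is a placeholder and `admin_url` is a hypothetical helper, while the two paths are taken from this page.

```python
# Paths as given on this page.
FORCE_UPGRADE_PATH = "/admin/force-upgrade.action"
UNMIGRATED_CONTENT_PATH = "/admin/unmigratedcontent.action"


def admin_url(confluence_address, path):
    """Join a Confluence base address and an admin path without doubling '/'."""
    return confluence_address.rstrip("/") + path


if __name__ == "__main__":
    base = "https://confluence.example.com"  # placeholder address
    print(admin_url(base, FORCE_UPGRADE_PATH))
    print(admin_url(base, UNMIGRATED_CONTENT_PATH))
```

Both pages require Confluence administrator access, so open the resulting URLs in a browser session where you are logged in as an administrator.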
Notes
We refer to the Confluence storage format as 'XHTML-based'. To be correct, we should call it XML, because the Confluence storage format does not comply with the XHTML definition. In particular, Confluence includes custom elements for macros and more. We're using the term 'XHTML-based' to indicate that there is a large proportion of HTML in the storage format.