Non-English characters, umlauts and diaereses are missing or appear as boxes in Confluence Data Center PDF export

Platform Notice: Data Center - This article applies to Atlassian products on the Data Center platform.

Note that this knowledge base article was created for the Data Center version of the product. Data Center knowledge base articles for non-Data Center-specific features may also work for Server versions of the product, however they have not been tested. Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

When a Confluence page is exported to PDF, non-English characters, such as those with umlauts or diaereses, might be missing from the PDF content, or appear as boxes or garbled text.

Environment

Confluence Data Center installation on a Linux Server.


Diagnosis

  1. The following encoding properties are already set to UTF-8:
    1. -Dsun.jnu.encoding=UTF-8
    2. -Dfile.encoding=UTF-8
  2. The database encoding is set correctly.
  3. The issue does not occur when the PDF conversion sandbox process for Confluence Data Center is disabled:
    • -Dpdf.export.sandbox.disable=true
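
The first diagnosis check can be scripted. The following is a minimal sketch, not an Atlassian-provided tool: has_utf8_flag and report_encoding_flags are hypothetical helper names, and the sketch assumes you can supply the JVM arguments of the Confluence process as a single space-separated string.

```shell
#!/bin/sh
# Sketch only: the function names below are illustrative and not part of Confluence.

# Return 0 if the argument string sets the given system property to UTF-8.
# $1 = space-separated JVM arguments, $2 = property name (e.g. file.encoding)
has_utf8_flag() {
  case " $1 " in
    *" -D$2=UTF-8 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the status of both encoding properties from the Diagnosis list.
# $1 = space-separated JVM arguments
report_encoding_flags() {
  for prop in sun.jnu.encoding file.encoding; do
    if has_utf8_flag "$1" "$prop"; then
      echo "$prop: UTF-8"
    else
      echo "$prop: not set to UTF-8"
    fi
  done
}
```

One possible way to call it, assuming the command line of the Confluence JVM contains the word "confluence", is report_encoding_flags "$(ps -eo args | grep '[c]onfluence' | head -n 1)". The helper only matches standalone -D arguments, so encoding values nested inside a comma-separated option are not counted.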


Cause

  1. The PDF conversion process in Confluence Data Center is handled by a separate sandbox process, even on a Data Center instance running on a single node.
  2. In some cases this process might not pick up the encoding that is set for Confluence, so these values must be passed to it manually.


Solution

  1. Add the following parameter to the setenv.sh file for each node and restart Confluence.
    • CATALINA_OPTS="-Dconversion.sandbox.java.options=-Xmx512m,-Xss2m,-Dsun.jnu.encoding=UTF-8,-Dfile.encoding=UTF-8 ${CATALINA_OPTS}"
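
After editing setenv.sh, it can be useful to confirm that the resulting options string really carries both encoding values inside the sandbox option. The following is a minimal sketch under that assumption: sandbox_options and sandbox_has_utf8 are hypothetical helper names, and the only property names used are the ones shown in the step above.

```shell
#!/bin/sh
# Sketch only: the function names below are illustrative and not part of Confluence.

# Print the comma-separated value of -Dconversion.sandbox.java.options, if present.
# $1 = space-separated options string (for example the value of CATALINA_OPTS)
sandbox_options() {
  printf '%s\n' "$1" | tr ' ' '\n' \
    | sed -n 's/^-Dconversion\.sandbox\.java\.options=//p' | head -n 1
}

# Return 0 only if the sandbox options include both UTF-8 encoding properties.
sandbox_has_utf8() {
  opts="$(sandbox_options "$1")"
  case ",$opts," in
    *",-Dsun.jnu.encoding=UTF-8,"*) ;;
    *) return 1 ;;
  esac
  case ",$opts," in
    *",-Dfile.encoding=UTF-8,"*) ;;
    *) return 1 ;;
  esac
  return 0
}
```

For example, after sourcing setenv.sh in a test shell, sandbox_has_utf8 "$CATALINA_OPTS" && echo "sandbox encoding configured" prints the message only when both values are present.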




Last modified on Feb 17, 2020
