Confluence is team content collaboration software. Onna supports Confluence Cloud and Confluence Server version 5.7 and up. Onna connects directly to the Confluence API to collect all information in its native format. The integration collects all data and metadata from an entire Confluence site or from individual spaces.
To collect from Confluence on-premise you will need Onna's Discovery app and an Onna Enterprise account. You can request the Onna Discovery application from our Support team, who will also need to whitelist the domain on our end before a collection can be performed. Your Confluence admin should be able to provide the domain for the sites that will be collected.
Please note the application needs to be installed by a user with admin-level access to that machine. If you have 2FA or SSO enabled for your Confluence site, you may need to create a new account without it enabled.
Generally, a server, virtual or physical, is preferred over a desktop or laptop unless the machine will remain unlocked and have no interruptions to its connectivity. In addition, the app must be installed on a machine that is behind the desired firewall and that has constant connectivity to the Confluence server and Internet.
All files are synced, including, but not limited to:
HTML content of the page
Comments on pages
Attachments for the page
Labels for attachments and pages
Ancestors for the page/attachments
Historical information and related metadata, including:
Author of the page
Last updated by/on
Previous Version created by/on
Types of Sync Available
For on-premise collections, only one-time sync is supported.
A one-time sync collects the information in an account up to a specified date. The collected data does not update after the sync completes.
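The cutoff behavior of a one-time sync can be pictured with a minimal sketch. This is not Onna's implementation: the record layout, the `within_cutoff` helper, and the choice to treat the cutoff as inclusive are all assumptions made for illustration.

```python
from datetime import datetime, timezone

def within_cutoff(items, cutoff):
    """Return the items whose last update is on or before the cutoff (assumed inclusive)."""
    return [item for item in items if item["last_updated"] <= cutoff]

# Hypothetical page records; only the "last_updated" key matters here.
pages = [
    {"title": "Release notes", "last_updated": datetime(2023, 1, 10, tzinfo=timezone.utc)},
    {"title": "Roadmap", "last_updated": datetime(2023, 6, 2, tzinfo=timezone.utc)},
]
cutoff = datetime(2023, 3, 1, tzinfo=timezone.utc)
collected = within_cutoff(pages, cutoff)
print([page["title"] for page in collected])
```

With this sample data only "Release notes" falls before the cutoff. Because the sync is one-time, a page edited after the collection would not change what was already collected.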
The synchronization scope currently encompasses entire Confluence sites, specific Confluence spaces, and specific Confluence pages.
All files and metadata can be exported in an eDiscovery-ready format. Load files are available as DAT, CSV, or custom text files.
The following metadata fields are exported:
Space ID (numeric field to identify space in Confluence)
Confluence Space Type
Ancestors for a file
List of Labels
All date-related metadata
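As a rough illustration of working with an exported CSV load file, the sketch below parses a small sample into dictionaries. The column names, the `>` separator for ancestors, and the `;` separator for labels are hypothetical; check the headers and delimiters of your actual export before relying on them.

```python
import csv
import io

# Hypothetical load-file excerpt; the column names are illustrative, not Onna's actual headers.
SAMPLE = """space_id,space_type,ancestors,labels,last_updated
98311,global,Home > Engineering,design;review,2023-01-10
98312,personal,Home,,2023-02-04
"""

def read_load_file(handle):
    """Parse a CSV load file into dictionaries, splitting the multi-value fields."""
    rows = []
    for row in csv.DictReader(handle):
        # Ancestors are assumed to be a ">"-separated path, labels a ";"-separated list.
        row["ancestors"] = [part.strip() for part in row["ancestors"].split(">") if part.strip()]
        row["labels"] = [label for label in row["labels"].split(";") if label]
        rows.append(row)
    return rows

records = read_load_file(io.StringIO(SAMPLE))
for record in records:
    print(record["space_id"], record["space_type"], record["labels"])
```

To read a real export, pass an open file object instead of the in-memory sample, for example `open("export.csv", newline="", encoding="utf-8")`.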
How-to Guide
First, install the app on a machine that is behind the desired firewall and that has constant connectivity to the Confluence server and Internet.
Note: Generally, a server, virtual or physical, is preferred over a desktop or laptop unless the machine will remain unlocked and have no interruptions to its connectivity. The app needs to be installed by a user with admin-level access to that machine. If you have 2FA or SSO enabled for your Confluence site, you may need to create a new account without it enabled.
The app opens to a login screen similar to the Onna platform's login screen.
Currently, you can follow either of these workflows:
Creating a new Workspace for your collection
Using an existing Workspace to add a data source
Next, inside the workspace, click "Add new Source".
Currently, you can use the app to add a Confluence or Jira source.
Enter the Confluence site's URL as the host. If the site is password-protected, enter your credentials here, including the full email address for your username. If the site is public, leave the username and password blank. (See the example below for a collection from a public site.) Once you've finished entering the details, click 'Connect'.
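Before entering these details in the app, it can help to confirm that the host URL and account work against the site directly. The sketch below only builds a request for the Confluence Server REST endpoint `/rest/api/space`; it does not send it. The host, username, and password are placeholders, the `build_space_request` helper is hypothetical, and whether the Onna app itself calls this endpoint or accepts basic authentication on your server is an assumption, not something this article states.

```python
import base64
import urllib.request

def build_space_request(host, username="", password=""):
    """Build (but do not send) a request that lists spaces on a Confluence site."""
    url = host.rstrip("/") + "/rest/api/space?limit=25"
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    # Public sites: leave the credentials blank and no Authorization header is added.
    if username and password:
        token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request

# Placeholder host and credentials, for illustration only.
public = build_space_request("https://confluence.example.com/")
private = build_space_request("https://confluence.example.com", "user@example.com", "secret")
print(public.full_url)
print(private.has_header("Authorization"))
```

Passing either request to `urllib.request.urlopen` from the machine that hosts the app would show whether the site is reachable from behind the firewall and whether the account can list spaces.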
Note: Confluence sources in Onna do not store usernames or passwords. Instead, they use JSESSIONID cookies. These credentials will need to be refreshed when the cookie expires. To avoid being prompted frequently to renew credentials, we suggest extending the amount of time the cookie is valid. You can follow the instructions here.
The "Collect external links" option attempts to collect and download the content linked from the Confluence page. A link that is not accessible without authentication will not be collected.
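To get a sense of which links on a page would count as external, the sketch below extracts anchors from a page's HTML and keeps those that point to a different host than the Confluence site. This is an illustrative approximation, not the rule Onna applies: the `external_links` helper, the host comparison, and the sample HTML are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def external_links(page_html, site_url):
    """Return absolute http(s) links whose host differs from the Confluence site's host."""
    collector = LinkCollector()
    collector.feed(page_html)
    site_host = urlparse(site_url).netloc.lower()
    result = []
    for href in collector.links:
        absolute = urljoin(site_url, href)  # resolve relative links against the site
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc.lower() != site_host:
            result.append(absolute)
    return result

# Hypothetical page body with one internal and one external link.
sample_page = '<p><a href="/display/ENG/Home">Home</a> <a href="https://example.org/spec.pdf">Spec</a></p>'
print(external_links(sample_page, "https://confluence.example.com"))
```

Only the `example.org` link is reported for this sample. Whether such a link can actually be collected still depends on it being reachable without authentication.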
Select the space(s) you would like to sync. To sync all, select "All Spaces".
Once you have clicked "Sync", you will see this integration within your Groups page. You will also see it within your Sources page on the web platform.
Onna will then connect to Confluence's API and begin syncing files. Files are processed and indexed so that all content is searchable. The source will indicate that it is syncing during this process.
Confluence pages in Onna
For on-premise Confluence collections, the collected pages are rendered in HTML.
Accessing audit logs
Follow this article for information on viewing source audit logs.
For Confluence on-premise collections, is it necessary to install anything on a server?
Yes. The application must be installed on a Windows machine with at least 8 GB of RAM that is always on and has constant connectivity to the Confluence server and the Internet.
Where will the information be stored for an on-premise collection?
The information that you collect using the app will be uploaded to Onna's cloud environment.
What type of login is needed - database or user?
A Confluence user account with full access to the space(s) that need to be collected.
If my collection runs into an error, what should I do?
Create a support ticket and our team will be happy to assist.
Is it possible to collect archived spaces?
At this time, it is not possible to sync archived spaces due to an API limitation. We suggest changing archived spaces to current in order to perform the required collection. Once the collection has completed successfully, the spaces can be archived again.
Is it possible to collect restricted spaces or pages?
It's only possible if the account used to create the collection has access to the restricted space or page. Our connector can only see what that user sees, even if that user is an admin, because admins can also have restricted access to a space or page.
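One way to anticipate gaps before starting a collection is to compare the spaces you intend to collect with the spaces the collecting account can actually see. The sketch below is a plain comparison of two lists of space keys. The keys and the `split_by_visibility` helper are hypothetical, and how you obtain the list of visible spaces (for example, by logging in as that account) is left to you.

```python
def split_by_visibility(requested_keys, visible_keys):
    """Separate requested space keys into those the account can see and those it cannot."""
    visible = set(visible_keys)
    collectable = [key for key in requested_keys if key in visible]
    missing = [key for key in requested_keys if key not in visible]
    return collectable, missing

# Hypothetical space keys, for illustration only.
requested = ["ENG", "HR", "LEGAL"]
visible_to_account = ["ENG", "LEGAL", "OPS"]
collectable, missing = split_by_visibility(requested, visible_to_account)
print(collectable, missing)
```

Any key reported as missing points to a space where the account needs to be granted access before the collection is created.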