Ranking of Search Results
Summary of the ranking method
When displaying the search results, Confluence applies a weighting to each of the content items returned.
To arrive at this single weighting, Confluence combines three separate weightings based on the following factors:
- The content type of the matching item – this includes user profiles, pages, blog posts, images and other attachments, etc. See "Weighting by content type" below.
- The field type in which the matching term was found – this includes title, name, body content, labels, etc. See "Weighting based on field type" below.
- The recency of the matching item – that is, when it was created or last modified; this could be today, yesterday, up to 1 week ago, up to 1 year ago, or over 1 year ago. See "Weighting based on recency" below.
The item with the heaviest final weighting will appear at the top of the list of search results. All of the other content items will appear below in descending order of weighting.
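The combination and ordering steps can be sketched as follows. This is a minimal illustration, not Confluence's implementation: it assumes the three weightings are multiplied together, which this page does not state, and `Item`, `final_weighting`, and `rank` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    content_type_weight: float
    field_type_weight: float
    recency_weight: float


def final_weighting(item: Item) -> float:
    # Assumption: the three weightings are combined by multiplication.
    return (
        item.content_type_weight
        * item.field_type_weight
        * item.recency_weight
    )


def rank(items):
    # The item with the heaviest final weighting comes first.
    return sorted(items, key=final_weighting, reverse=True)


results = rank([
    Item("Old comment", 1.0, 1.0, 1.0),
    Item("Fresh page", 1.5, 2.0, 2.01),
    Item("Recent blog", 1.3, 1.0, 1.52),
])
for item in results:
    print(item.title, round(final_weighting(item), 3))
```

Under the multiplication assumption, a page matched in its title and modified today outranks a week-old blog post matched only in its body, which in turn outranks a comment that has not been touched in over a year.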
Weighting by content type
Content type | Weighting |
---|---|
User profile | 1.5 |
Space | 1.5 |
Page | 1.5 |
Blog | 1.3 |
Attachment (files, videos and images) | 1 |
Comment | 1 |
1 |
In Confluence, you're most likely to be searching for knowledge articles, work done by a particular team or person, or ways to collaborate. Our ranking logic therefore prioritises the content types best suited to these tasks: user profiles, spaces, and pages.
Simple example
If your search returns three items – a page, a blog, and a comment – and they are the same in every other way, then they will be ranked in that order: the page first, then the blog, then the comment.
Weighting based on field type
Field type | Weighting |
---|---|
Title | 2 |
Content | 1 |
Unstemmed title | 1 |
Label | 0 |
A match in the title field is weighted twice as highly as a match in the body content.
Simple example
If you search for a user's name, the search results will rank the person's user profile above a page that only contains their name in the content. This is because the profile contains the name in the title field. This example assumes the results are the same in every other way.
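The name-search example can be sketched with a simple lookup of the field type weightings from the table above. The `FIELD_TYPE_WEIGHT` dictionary and `field_weight` function are hypothetical names, and the sketch assumes each result matched in exactly one field, which is a simplification.

```python
# Field type weightings copied from the table above.
FIELD_TYPE_WEIGHT = {
    "title": 2.0,
    "content": 1.0,
    "unstemmed title": 1.0,
    "label": 0.0,
}


def field_weight(matched_field: str) -> float:
    # Look up the weighting for the field in which the term was found.
    return FIELD_TYPE_WEIGHT[matched_field]


# Each tuple is (result description, field in which the name matched).
matches = [
    ("Page mentioning the name", "content"),
    ("User profile", "title"),
]

# Results are otherwise identical, so only the field weighting decides.
ranked = sorted(matches, key=lambda m: field_weight(m[1]), reverse=True)
print([description for description, _ in ranked])
```

Because the profile matches in the title field (2) and the page only in its content (1), the profile sorts first.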
Weighting based on recency
Last activity | Weighting |
---|---|
Today | 2.01-2.05 |
Yesterday | 1.92-2.01 |
Up to 1 week ago | 1.52-1.92 |
Up to 1 month ago | 1.46-1.52 |
Up to 3 months ago | 1.36-1.46 |
Up to 6 months ago | 1.25-1.36 |
Up to 1 year ago | 1.11-1.25 |
Beyond a year | 1-1.11 |
Recency is based on when an item was created or last modified, whichever happened more recently. Search gives a higher weighting to recently updated items because it assumes this content is likely to be more relevant than idle or older content.
When a content item has not been modified in over a year, we say that it is in a state of "decay". In this state, we assume it is less relevant and it is de-prioritised in the search results in favour of content modified within the last year, and even more so within the last week.
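As a rough sketch of how last activity maps onto the table above, the following function returns the weighting range for an item. It is illustrative only: it assumes "today" means within the last 24 hours, a month is 30 days, and a year is 365 days, none of which this page specifies. `RECENCY_BUCKETS` and `recency_range` are hypothetical names.

```python
from datetime import datetime, timedelta

# (upper bound on age, (lowest weighting, highest weighting)).
# Assumption: day-based boundaries approximate the table's labels.
RECENCY_BUCKETS = [
    (timedelta(days=1), (2.01, 2.05)),    # today
    (timedelta(days=2), (1.92, 2.01)),    # yesterday
    (timedelta(days=7), (1.52, 1.92)),    # up to 1 week ago
    (timedelta(days=30), (1.46, 1.52)),   # up to 1 month ago
    (timedelta(days=90), (1.36, 1.46)),   # up to 3 months ago
    (timedelta(days=180), (1.25, 1.36)),  # up to 6 months ago
    (timedelta(days=365), (1.11, 1.25)),  # up to 1 year ago
]
BEYOND_A_YEAR = (1.0, 1.11)  # the "decay" state


def recency_range(created: datetime, last_modified: datetime, now: datetime):
    # Recency uses whichever of creation or modification is more recent.
    last_activity = max(created, last_modified)
    age = now - last_activity
    for upper_bound, weight_range in RECENCY_BUCKETS:
        if age < upper_bound:
            return weight_range
    return BEYOND_A_YEAR


now = datetime(2024, 6, 1, 12, 0)
print(recency_range(now - timedelta(days=400), now - timedelta(hours=3), now))
print(recency_range(now - timedelta(days=800), now - timedelta(days=500), now))
```

The first call describes an old page edited three hours ago, which falls in the top range; the second describes a page untouched for more than a year, which falls in the decay range.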
Simple example
- If two documents match in all other ways, then the newer one will be shown first.
- If the two documents being compared are both older than a year, then their relative age does not matter.
Confluence uses the Apache Lucene search engine library. Lucene's score calculation has a number of additional terms, not mentioned in the above example. We have simplified the above explanation of search ranking for purposes of illustration. If you are interested, you can see more information in the Lucene documentation.