Overview
When you connect Confluence as a data source, CoAgentor syncs wiki pages from your selected spaces and processes them for use as agent context during meetings. This article covers what gets synced, how content is extracted, and how to use Confluence pages in your contexts.
What Gets Synced
CoAgentor syncs published pages from your selected Confluence spaces. For each page, the following content is extracted:
- Page title (used as a heading for context)
- Paragraph text
- Headings and subheadings
- Bulleted and numbered lists
- Tables (converted to text rows)
- Code blocks (with language labels)
- Info, warning, note, and tip panels
- Collapsible sections (expand macros)
- Task lists (with checkbox status)
- Status badges and Jira issue references
The following content types are not extracted, because they add no searchable text to the page body:
- Images and screenshots
- Embedded videos
- File attachments
- Confluence macros that only render visuals (charts, roadmaps, etc.)
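As a rough illustration of what extraction involves, the sketch below pulls plain text out of a small fragment of Confluence storage-format XHTML using Python's standard html.parser module. It is not CoAgentor's extractor: the names StorageTextExtractor and extract_text, the tag sets, and the " | " cell separator are assumptions made for this example. It only handles headings, paragraphs, list items, and table rows, and it skips image elements.

```python
from html.parser import HTMLParser

# Hypothetical tag sets for this sketch; a real extractor would cover many more.
SKIPPED_TAGS = {"ac:image", "ri:attachment"}   # no searchable text inside
CELL_TAGS = {"td", "th"}                        # table cells within a row
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "tr"}


class StorageTextExtractor(HTMLParser):
    """Collects one plain-text line per block element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []       # finished lines of text
        self.current = []     # text fragments of the block being read
        self.skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in CELL_TAGS:
            self.current.append(" | ")   # separate cells within a row
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)

    def flush(self):
        # Drop a trailing cell separator, then collapse runs of whitespace.
        text = "".join(self.current).strip().rstrip("|")
        text = " ".join(text.split())
        if text:
            self.lines.append(text)
        self.current = []


def extract_text(xhtml: str) -> list[str]:
    parser = StorageTextExtractor()
    parser.feed(xhtml)
    parser.close()
    parser.flush()  # keep any text that follows the last block element
    return parser.lines


SAMPLE = (
    "<h1>Runbook</h1>"
    "<p>Restart the <strong>API</strong> service.</p>"
    '<ac:image><ri:attachment ri:filename="a.png" /></ac:image>'
    "<table><tbody>"
    "<tr><th>Env</th><th>Owner</th></tr>"
    "<tr><td>prod</td><td>ops</td></tr>"
    "</tbody></table>"
)

for line in extract_text(SAMPLE):
    print(line)
```

The heading and paragraph each become one line, the image contributes nothing, and each table row becomes a single text row with its cells joined by the separator.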
How Content Is Processed
Each Confluence page goes through this pipeline:
- Fetch — the page's storage format (Confluence's internal XHTML) is downloaded via the REST API
- Extract — the XHTML is converted to clean, readable plain text
- Chunk — the text is split into 512-token segments with 50-token overlap
- Embed — each chunk is converted to a vector embedding
- Store — embeddings are saved for semantic search during meetings
This is the same pipeline used for all data sources (file uploads, Google Drive, GitHub, HubSpot, Notion). Adding Confluence doesn't require any special setup beyond connecting and selecting spaces.
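The Chunk step above can be sketched as a sliding window over a token list. This is a minimal illustration, not CoAgentor's implementation: the function name chunk_tokens is hypothetical, and whitespace splitting stands in for whatever tokenizer the product actually uses, so real chunk boundaries will differ.

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split tokens into windows of `size`, each sharing `overlap` tokens
    with the previous window."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# Assumption: whitespace-separated words stand in for model tokens.
page_text = " ".join(f"word{i}" for i in range(1000))
chunks = chunk_tokens(page_text.split())
print(len(chunks), [len(c) for c in chunks])
```

With a 512-token window and 50-token overlap, the window advances 462 tokens at a time, so the last 50 tokens of one chunk are repeated as the first 50 of the next. The overlap keeps a sentence that straddles a boundary intact in at least one chunk.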
Change Detection
CoAgentor uses Confluence's page version numbers to detect changes. Each time a page is edited in Confluence, its version number increments. During each sync cycle:
- Pages with a new version number are re-downloaded, re-extracted, and re-embedded
- Pages with unchanged version numbers are skipped, so their content is not re-downloaded or re-embedded
- Pages that no longer appear in the API response are marked as deleted
This means edits to your Confluence documentation are automatically reflected in your agent's context within 6 hours (or immediately if you trigger a manual sync).
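The version comparison described above can be sketched as a diff between two mappings of page ID to version number. The names SyncPlan and plan_sync and the dictionary shapes are assumptions for illustration; how CoAgentor stores versions internally is not documented here.

```python
from dataclasses import dataclass


@dataclass
class SyncPlan:
    reprocess: list[str]  # new or edited pages: re-download, re-extract, re-embed
    skip: list[str]       # unchanged pages
    deleted: list[str]    # pages missing from the latest API response


def plan_sync(stored: dict[str, int], remote: dict[str, int]) -> SyncPlan:
    """Compare versions saved at the last sync with versions reported now."""
    reprocess = sorted(
        page_id for page_id, version in remote.items()
        if stored.get(page_id) != version  # also true for pages never seen before
    )
    skip = sorted(
        page_id for page_id, version in remote.items()
        if stored.get(page_id) == version
    )
    deleted = sorted(page_id for page_id in stored if page_id not in remote)
    return SyncPlan(reprocess, skip, deleted)


# Hypothetical page IDs and version numbers.
stored = {"101": 3, "102": 7, "103": 1}
remote = {"101": 4, "102": 7, "104": 1}
print(plan_sync(stored, remote))
```

In this example page 101 was edited and page 104 is new, so both are reprocessed; page 102 is skipped; page 103 is marked as deleted.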
Using Confluence in Contexts
After syncing, your Confluence pages appear alongside other data sources when building contexts:
- Go to Data Sources or open the context builder in your agent settings
- Create or edit a context
- Select your Confluence connection as a data source
- Optionally select specific pages, or include the entire connection
Contexts control which data sources are active for which agents and meetings. See Contexts & Scopes for more on how scoping works.
Sync Frequency
Confluence pages sync automatically every 6 hours. You can also trigger a sync manually from the Confluence card on your Integrations page.
The 6-hour interval balances freshness against API rate limits. For most documentation use cases this is sufficient, because wiki pages change less often than, say, CRM records.
Tips for Best Results
Organize by space. Select only the spaces that are relevant to your agents' use cases. A sales agent probably doesn't need your engineering runbooks.
Keep pages focused. Confluence pages that cover a single topic chunk better than sprawling mega-pages. Retrieval works best when each chunk is semantically coherent.
Use headings. Well-structured pages with clear headings produce better chunks, because the heading text provides context for the sections below it.
Check your plan limit. If you're on a limited plan, prioritize the highest-value spaces. You can change your space selection at any time, and the next sync will apply the change.