Data Sources Overview

Give your agents access to your documents, knowledge bases, and external tools — so they can reference your content during live meetings.

Data sources let you connect your documents, files, and external knowledge bases to CoAgentor so your agents can reference them during live meetings. When a trigger fires, the agent doesn't just respond generically — it draws on your specific content.

Manage data sources →


How it works

CoAgentor uses a Retrieval-Augmented Generation (RAG) pipeline to make your content available to agents at meeting time:

  1. You upload a file or connect an external data source
  2. CoAgentor processes the content — extracting text, splitting it into chunks, and generating vector embeddings
  3. When a trigger fires during a meeting, CoAgentor searches your indexed content for the most relevant passages based on the current conversation
  4. The retrieved passages are injected into the AI's context alongside the transcript, grounding the response in your actual content

This means your agents can answer questions by referencing your product documentation, pricing sheets, playbooks, knowledge base articles, or any other content you've connected, rather than relying only on the model's general training data.
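The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not CoAgentor's implementation: the bag-of-words `embed` function stands in for a real embedding model, and every function name, the chunk size, and the sample documents are assumptions made up for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40) -> list:
    """Step 2a: split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Step 2b: toy embedding (term counts); a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents: dict) -> list:
    """Steps 1-2: turn each document into (name, chunk, vector) entries."""
    return [(name, chunk, embed(chunk))
            for name, text in documents.items()
            for chunk in chunk_text(text)]


def retrieve(index: list, query: str, top_k: int = 2) -> list:
    """Step 3: rank chunks by similarity to the current conversation."""
    q = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(name, chunk) for name, chunk, _ in ranked[:top_k]]


def build_prompt(transcript: str, passages: list) -> str:
    """Step 4: place retrieved passages alongside the transcript."""
    context = "\n".join(f"[{name}] {chunk}" for name, chunk in passages)
    return f"Context:\n{context}\n\nTranscript:\n{transcript}"


# Made-up sample content, for illustration only.
documents = {
    "refunds.txt": "Refunds are accepted within 30 days of purchase.",
    "shipping.txt": "Standard shipping takes five business days.",
}
index = build_index(documents)
passages = retrieve(index, "Are refunds accepted after purchase?", top_k=1)
prompt = build_prompt("Customer: Are refunds accepted after purchase?", passages)
print(prompt)
```

Keeping indexing separate from retrieval means the expensive work (chunking and embedding) happens once per file, while each trigger only pays for a similarity search.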

[Diagram: File upload → Chunking and embedding → Meeting trigger fires → Relevant chunks retrieved → Injected into agent response context]


Available data sources

File Upload

Upload files directly from your computer. Supports PDF, DOCX, XLSX, CSV, TXT, Markdown, code files, and many more formats. Files are processed immediately after upload.

Learn more about file uploads →

Google Drive

Connect your Google Drive to browse and select files in the context picker. CoAgentor watches for changes via push notifications and re-syncs automatically when files are updated.

Connect Google Drive → · Using Drive files as context →

GitHub

Connect your GitHub account to sync repository files: documentation, code, configs, and READMEs. A daily background sync detects changes via SHA comparison, so your agent works from a copy that is refreshed once a day.
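As a rough illustration of SHA-based change detection, the sketch below compares a stored map of file paths to SHAs against the current one and reports what a sync would need to re-index. It assumes the SHAs are Git blob hashes, which Git computes as SHA-1 over a `blob <size>\0` header plus the file content. The function names and the shape of the maps are hypothetical, not CoAgentor's actual sync code.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Compute the SHA-1 that Git assigns to a blob with this content."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


def plan_sync(indexed: dict, remote: dict) -> dict:
    """Compare path -> SHA maps and list added, changed, and removed paths."""
    return {
        "added": sorted(p for p in remote if p not in indexed),
        "changed": sorted(p for p in remote if p in indexed and indexed[p] != remote[p]),
        "removed": sorted(p for p in indexed if p not in remote),
    }


# Hypothetical state: what was indexed last time vs. what the repository holds now.
indexed = {"README.md": git_blob_sha(b"v1\n"), "docs/setup.md": git_blob_sha(b"setup\n")}
remote = {"README.md": git_blob_sha(b"v2\n"), "docs/faq.md": git_blob_sha(b"faq\n")}
plan = plan_sync(indexed, remote)
print(plan)
```

Comparing hashes avoids downloading and re-embedding files whose content has not changed since the previous sync.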

Connect GitHub → · Using GitHub files as context →

HubSpot

Connect your HubSpot CRM to sync contacts, companies, and deals. CRM records are formatted as structured plaintext optimised for retrieval. Configure sync filters to include only the records that matter. Syncs every 6 hours.
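To give a feel for what "structured plaintext" could look like, here is a small sketch that filters records and renders one as labelled lines. The property names, the filter shape, and the output layout are all assumptions for illustration; they are not HubSpot's API schema or CoAgentor's actual format.

```python
def format_record(record_type: str, properties: dict) -> str:
    """Render one CRM record as labelled plaintext lines, skipping empty fields."""
    lines = [f"Record type: {record_type}"]
    for key, value in properties.items():
        if value is None or value == "":
            continue
        label = key.replace("_", " ").capitalize()
        lines.append(f"{label}: {value}")
    return "\n".join(lines)


def keep_record(properties: dict, required: dict) -> bool:
    """Hypothetical sync filter: keep records matching every required property value."""
    return all(properties.get(k) == v for k, v in required.items())


# Made-up sample deals.
deals = [
    {"name": "Acme renewal", "amount": 12000, "deal_stage": "negotiation", "close_date": None},
    {"name": "Globex pilot", "amount": 3000, "deal_stage": "closed_lost", "close_date": ""},
]
selected = [d for d in deals if keep_record(d, {"deal_stage": "negotiation"})]
text = format_record("deal", selected[0])
print(text)
```

One labelled field per line keeps each record self-describing, so a retrieved chunk still makes sense when it is read out of context.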

Connect HubSpot → · Using HubSpot data as context →

Confluence

Connect your Atlassian Confluence workspace to sync wiki pages from selected spaces. Documentation, runbooks, process guides, and knowledge base articles are converted from Confluence's storage format to clean plaintext. Syncs every 6 hours with version-based change detection.
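The conversion and change-detection steps might look roughly like the sketch below. It treats the storage format as XHTML-style markup and strips tags with Python's standard `html.parser`. The set of block-level tags, the function names, and the integer version comparison are assumptions for illustration, and a real converter would also need to handle Confluence macros.

```python
from html.parser import HTMLParser
from typing import Optional

# Tags that should start a new line in the plaintext output (an assumed, partial list).
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "tr", "br"}


class _TextExtractor(HTMLParser):
    """Collect text content, inserting line breaks at block-level tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def storage_to_text(markup: str) -> str:
    """Strip markup and normalise whitespace, one block per line."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


def needs_resync(indexed_version: Optional[int], current_version: int) -> bool:
    """Re-index a page if it was never indexed or its version number has increased."""
    return indexed_version is None or current_version > indexed_version


page = "<h1>Runbook</h1><p>Restart the <strong>api</strong> service.</p><ul><li>Check logs</li></ul>"
print(storage_to_text(page))
```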

Connect Confluence → · Using Confluence content as context →

Notion

Connect your Notion workspace to sync pages and databases. Page content is extracted recursively (headings, lists, tables, code blocks, toggle content) and database rows are serialised as structured property blocks.
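Recursive extraction can be pictured as a depth-first walk over nested blocks. The block dictionaries below use a simplified, hypothetical shape (`text` and `children` keys) rather than the real Notion API response format, and the property-block layout for database rows is likewise an assumption.

```python
def extract_blocks(blocks: list, depth: int = 0) -> list:
    """Walk nested blocks depth-first, indenting child content under its parent."""
    lines = []
    for block in blocks:
        text = block.get("text", "")
        if text:
            lines.append("  " * depth + text)
        # Toggles, lists, and similar blocks carry their content as children.
        lines.extend(extract_blocks(block.get("children", []), depth + 1))
    return lines


def serialise_row(row: dict) -> str:
    """Render one database row as a block of 'Property: value' lines."""
    return "\n".join(f"{name}: {value}" for name, value in row.items())


# Made-up page and database row.
page = [
    {"type": "heading", "text": "Onboarding"},
    {"type": "toggle", "text": "Week 1", "children": [
        {"type": "list_item", "text": "Meet the team"},
    ]},
]
page_lines = extract_blocks(page)
row_text = serialise_row({"Name": "Acme", "Status": "Active"})
print("\n".join(page_lines))
print(row_text)
```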

Note: The Notion integration is built and functional but is currently awaiting Notion's third-party app approval. It will become available to all users once approved.

Connect Notion → · Using Notion content as context →


The two-layer model: Data Sources and Contexts

Data sources and contexts work together but serve different purposes:

  • A data source is the storage layer — where files live and how they are indexed. You connect or upload content once, and it is available to use across any context.
  • A context is the retrieval policy — a named collection of files that defines which content is active for a given meeting, agent, or scope.

This separation means a single file can be reused across multiple contexts: your product documentation might appear in both a "Sales" context and a "Support" context, without uploading it twice. Adding a new connector only requires changes to the storage layer; contexts and the retrieval pipeline are connector-agnostic.
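One way to picture the two layers is as two small data structures: files are stored once and keyed by ID, and a context is just a named set of those IDs. The class and field names below are hypothetical and only illustrate the separation, not CoAgentor's data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IndexedFile:
    """Storage layer: one indexed file, regardless of which connector supplied it."""
    file_id: str
    source: str  # e.g. "upload", "google_drive", "github"
    name: str


@dataclass
class Context:
    """Retrieval policy: a named collection of file IDs."""
    name: str
    file_ids: set = field(default_factory=set)


class Workspace:
    def __init__(self) -> None:
        self.files = {}
        self.contexts = {}

    def add_file(self, f: IndexedFile) -> None:
        self.files[f.file_id] = f

    def attach(self, context_name: str, file_id: str) -> None:
        """Reference an already-indexed file from a context without copying it."""
        if file_id not in self.files:
            raise KeyError(f"unknown file: {file_id}")
        self.contexts.setdefault(context_name, Context(context_name)).file_ids.add(file_id)

    def active_files(self, context_name: str) -> list:
        """Return the files that retrieval may search for this context."""
        ctx = self.contexts.get(context_name)
        if ctx is None:
            return []
        return sorted((self.files[i] for i in ctx.file_ids), key=lambda f: f.name)


ws = Workspace()
ws.add_file(IndexedFile("f1", "upload", "product-docs.pdf"))
ws.add_file(IndexedFile("f2", "github", "README.md"))
ws.attach("Sales", "f1")
ws.attach("Support", "f1")
ws.attach("Support", "f2")
print([f.name for f in ws.active_files("Support")])
```

Because contexts hold only IDs, the same file serves both "Sales" and "Support" from a single indexed copy, and nothing in `Context` depends on which connector produced it.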

Learn about Contexts and Scopes →


Plan limits

The number of items you can index varies by plan and data source type. Some connectors (Google Drive, Notion) are unlimited on all plans. Others are tiered to match typical usage patterns at each plan level.

Data source            Free         Solo         Pro
Uploaded files         3            10           Unlimited
Google Drive files     Unlimited    Unlimited    Unlimited
GitHub files           10           50           Unlimited
HubSpot CRM records    50           500          Unlimited
Confluence pages       25           250          Unlimited
Notion pages           Unlimited    Unlimited    Unlimited
Max file size          10 MB        50 MB        200 MB
Storage capacity       25 MB        500 MB       10240 MB
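For illustration, the limits above could be expressed as a lookup table with a simple pre-indexing check. The values are transcribed from the plan limits table. The key names, the `can_index` function, and the assumption that the file size and storage caps apply to every item regardless of source are hypothetical.

```python
# Item limits per plan, transcribed from the plan limits table; None means unlimited.
ITEM_LIMITS = {
    "uploaded_files": {"free": 3, "solo": 10, "pro": None},
    "google_drive_files": {"free": None, "solo": None, "pro": None},
    "github_files": {"free": 10, "solo": 50, "pro": None},
    "hubspot_records": {"free": 50, "solo": 500, "pro": None},
    "confluence_pages": {"free": 25, "solo": 250, "pro": None},
    "notion_pages": {"free": None, "solo": None, "pro": None},
}
MAX_FILE_SIZE_MB = {"free": 10, "solo": 50, "pro": 200}
STORAGE_CAPACITY_MB = {"free": 25, "solo": 500, "pro": 10240}


def can_index(plan: str, source: str, current_items: int,
              item_size_mb: float, used_storage_mb: float) -> tuple:
    """Return (allowed, reason) for adding one more item under the given plan."""
    limit = ITEM_LIMITS[source][plan]
    if limit is not None and current_items + 1 > limit:
        return False, f"{source} limit of {limit} reached on the {plan} plan"
    if item_size_mb > MAX_FILE_SIZE_MB[plan]:
        return False, f"item exceeds the {MAX_FILE_SIZE_MB[plan]} MB file size limit"
    if used_storage_mb + item_size_mb > STORAGE_CAPACITY_MB[plan]:
        return False, f"storage capacity of {STORAGE_CAPACITY_MB[plan]} MB would be exceeded"
    return True, "ok"


print(can_index("free", "uploaded_files", 3, 1, 0))
print(can_index("pro", "notion_pages", 1_000_000, 1, 0))
```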

View full plan comparison →

Still have questions?

Our team typically responds within one business day.

Contact us →