Enterprise Knowledge Base Search with AI
Employees waste hours every week searching for information that exists somewhere in the organization but is impossible to find. We build AI retrieval systems that answer natural language questions accurately, with sources cited.
The Challenge
At a 2,800-person professional services firm, institutional knowledge lives in 140 SharePoint sites, a legacy Confluence instance nobody migrated off, Google Drive folders owned by people who left two years ago, a ServiceNow KB, and Slack channels whose search coverage simply stops partway into the archive. A new hire's first 90 days are spent mostly asking senior colleagues where things are. A time study confirmed that a senior consultant loses 3-4 hours a week to information-finding. Questions get re-answered from first principles because the last team that answered them left no searchable artifact. The firm has tried SharePoint Enterprise Search twice; it returns keyword matches that technically contain the search terms but don't answer the question. People have stopped trusting search and default to asking in Teams.
Our Approach
A Retrieval-Augmented Generation system built on Claude Sonnet 4.5, OpenAI text-embedding-3-large, and Pinecone connects to your existing knowledge sources via their native APIs (Microsoft Graph for SharePoint and OneDrive, Confluence REST, Google Drive, Notion, ServiceNow KB, and Slack search export). Documents are chunked at paragraph level with overlap, embedded into Pinecone, and tagged with source metadata: owner, last modified, access control list, document type. A query agent rewrites the user's question for retrieval, pulls the top 15 candidate chunks, re-ranks with a cross-encoder, synthesizes an answer using only retrieved content, and cites specific sources with deep-links. Access permissions are enforced at query time against Azure AD and Google Workspace, so users only see answers from documents they can access. Unanswered questions feed a knowledge-gap report delivered weekly to content owners.
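To make the data model concrete, here is a minimal sketch of one indexed chunk as a Pinecone record. The id scheme, metadata field names, and example values are illustrative assumptions, not a fixed schema.

```python
# Sketch of one indexed chunk as a Pinecone upsert record.
# All ids, field names, and values below are illustrative assumptions.
chunk_record = {
    "id": "sharepoint:hr-policies/remote-work.docx#chunk-07",
    "values": [0.0] * 3072,  # embedding from text-embedding-3-large (3072 dims)
    "metadata": {
        "source_system": "sharepoint",
        "owner": "hr-ops@example.com",
        "modified": "2025-11-02T14:31:00Z",
        "acl": ["grp-hr-all", "grp-emea-consultants"],  # groups with read access
        "doc_type": "policy",
        "language": "en",
        "deep_link": "https://example.sharepoint.com/sites/hr/remote-work.docx",
        "text": "Employees working from international locations must notify...",
    },
}

# Records are upserted in batches during indexing; the `acl` list is what
# the query-time permission filter matches against.
# index.upsert(vectors=[chunk_record])
```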
How We Do It
Knowledge Source Indexing
We connect to your document sources through their APIs: Microsoft Graph for SharePoint/OneDrive, Confluence REST, Google Drive API, Notion, ServiceNow KB, GitHub Wiki, and internal PDF repositories via direct scrape. Documents are chunked at semantic boundaries (paragraph, section, table) with 10-15% overlap, embedded with OpenAI text-embedding-3-large, and stored in Pinecone with metadata: source system, owner, created and modified dates, ACL, language, document type. Initial indexing runs in batches; incremental sync runs every 15-30 minutes using change webhooks or delta queries. Failure mode: a document links out to a system we can't index (an external vendor portal, for example). The system flags unresolvable references rather than indexing placeholder text.
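As a rough sketch of the chunking step: split on paragraph boundaries, pack paragraphs into chunks up to a size limit, and carry the tail of each chunk into the next for continuity. The character limit and overlap ratio below are assumptions, not tuned values.

```python
def chunk_document(text: str, max_chars: int = 1500,
                   overlap_ratio: float = 0.12) -> list[str]:
    """Split text at paragraph boundaries into chunks of up to max_chars,
    carrying ~overlap_ratio of each chunk's tail into the next one."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            # Seed the next chunk with the tail of this one (the overlap).
            tail = current[-int(max_chars * overlap_ratio):]
            current = tail + "\n\n" + para
        else:
            current = (current + "\n\n" + para) if current else para
    if current:
        chunks.append(current)
    return chunks
```

A production version would split on section and table boundaries as well, and fall back to sentence splits for paragraphs longer than the limit; this sketch only shows the paragraph-plus-overlap mechanics.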
Natural Language Search Interface
Users ask questions in plain language through a chat UI, a Slack or Teams bot, or an embedded widget in SharePoint. The query agent rewrites ambiguous questions into retrieval-friendly form (expanding acronyms, adding synonyms from a company glossary), retrieves the top 15 chunks from Pinecone, re-ranks them with a cross-encoder (Cohere Rerank v3 or equivalent), and passes the top 5-8 chunks to Claude Sonnet 4.5 for synthesis. The response cites specific documents with deep-links and a confidence level. Multi-turn conversations maintain context for follow-ups. Failure mode: the user asks something truly novel with no coverage in the knowledge base. The system returns 'I don't have enough to answer this confidently' and logs the gap.
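A condensed sketch of that query path, assuming the four SDKs are installed and credentialed. The index name, glossary, metadata fields, and model ids are placeholders rather than fixed choices, and error handling and multi-turn context are omitted.

```python
from anthropic import Anthropic
from openai import OpenAI
from pinecone import Pinecone
import cohere

oai = OpenAI()        # reads OPENAI_API_KEY from the environment
co = cohere.Client()  # reads CO_API_KEY
pc = Pinecone()       # reads PINECONE_API_KEY
claude = Anthropic()  # reads ANTHROPIC_API_KEY
index = pc.Index("kb-chunks")        # hypothetical index name
GLOSSARY = {"PTO": "paid time off"}  # stand-in for the company glossary

def answer(question: str, user_groups: list[str]) -> str:
    # 1. Rewrite: expand acronyms the retriever would otherwise miss.
    for acronym, expansion in GLOSSARY.items():
        question = question.replace(acronym, f"{acronym} ({expansion})")

    # 2. Retrieve the top 15 candidate chunks the user is allowed to see.
    vec = oai.embeddings.create(
        model="text-embedding-3-large", input=question
    ).data[0].embedding
    hits = index.query(
        vector=vec, top_k=15, include_metadata=True,
        filter={"acl": {"$in": user_groups}},
    ).matches

    # 3. Re-rank with a cross-encoder and keep the best few.
    docs = [h.metadata["text"] for h in hits]
    ranked = co.rerank(model="rerank-english-v3.0", query=question,
                       documents=docs, top_n=6)
    top = [hits[r.index] for r in ranked.results]

    # 4. Synthesize strictly from retrieved content, with citations.
    sources = "\n\n".join(
        f"[{i + 1}] {h.metadata['deep_link']}\n{h.metadata['text']}"
        for i, h in enumerate(top)
    )
    msg = claude.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        system=("Answer only from the numbered sources. Cite them as [n]. "
                "If they do not contain the answer, say you do not have "
                "enough to answer this confidently."),
        messages=[{"role": "user",
                   "content": f"Sources:\n{sources}\n\nQuestion: {question}"}],
    )
    return msg.content[0].text
```

Note that the 'answer only from retrieved content' and 'I don't know' behaviors live in the system prompt, not in the retrieval code.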
Access Control and Permission Enforcement
Permissions are enforced at query time, not index time. When a user queries, the system resolves their identity (SSO via Azure AD, Okta, or Google Workspace), pulls their group memberships, and filters Pinecone results to documents the user has read access to. For SharePoint, we query Graph API for effective permissions; for Google Drive, the Drive permissions API; for Confluence, the space permissions API. A user who cannot read a document in the source system cannot receive answers sourced from it. Failure mode: permissions change after indexing (user removed from a group, document ACL updated). The next query picks up the change because permissions are resolved live, not cached.
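For SharePoint-backed content, group resolution can go straight to Microsoft Graph. The sketch below assumes an app-level access token is acquired elsewhere, and pairs with the acl metadata shown earlier; the function name is ours, not a library API.

```python
import requests

def user_group_ids(upn: str, token: str) -> list[str]:
    """Resolve a user's transitive group memberships via Microsoft Graph.
    `token` is an app-level Graph access token obtained elsewhere."""
    url = f"https://graph.microsoft.com/v1.0/users/{upn}/transitiveMemberOf?$select=id"
    groups: list[str] = []
    while url:
        page = requests.get(url, headers={"Authorization": f"Bearer {token}"}).json()
        groups += [g["id"] for g in page.get("value", [])]
        url = page.get("@odata.nextLink")  # follow pagination, if any
    return groups

# At query time, only chunks whose acl overlaps the user's groups come back:
# index.query(..., filter={"acl": {"$in": user_group_ids(upn, token)}})
```

Because the lookup happens per query, a revoked group membership takes effect on the user's next question with no re-indexing.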
Gap Analysis and Content Improvement
Every low-confidence answer, every 'I don't know' response, and every user thumbs-down writes to a gap log. A weekly analysis groups related unanswered questions (e.g. 23 variations of 'what is our policy on remote work from international locations') and surfaces them to the relevant knowledge owner with sample questions, a proposed article outline, and a one-click 'claim this topic' action. Answered questions that generate high re-query rates (users asking similar questions repeatedly) signal knowledge that exists but is hard to find. Failure mode: the gap report goes to a distribution list that nobody actually owns. We track response rate on gap reports per owner and escalate stale ones.
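A toy sketch of the grouping step, using stdlib string similarity as a stand-in for the embedding-based clustering a production version would more likely use; the 0.6 threshold and the sample questions are assumptions.

```python
from difflib import SequenceMatcher

def group_gaps(questions: list[str], threshold: float = 0.6) -> list[list[str]]:
    """Greedily cluster unanswered questions: each question joins the first
    group whose exemplar it resembles, else it starts a new group."""
    groups: list[list[str]] = []
    for q in questions:
        for group in groups:
            if SequenceMatcher(None, q.lower(), group[0].lower()).ratio() >= threshold:
                group.append(q)
                break
        else:
            groups.append([q])
    # Largest groups first: these are the gaps worth an article.
    return sorted(groups, key=len, reverse=True)

gaps = group_gaps([
    "what is our policy on remote work from international locations",
    "what is the policy on remote work from international locations?",
    "how do I file a travel expense in SAP",
])
for g in gaps:
    print(f"{len(g)} similar question(s), e.g.: {g[0]}")
```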
Where this fits — and where it doesn't
Good fit when
- ✓ Organizations with 500+ employees where knowledge is genuinely distributed across multiple systems, and where document ownership and access control are reasonably well-maintained even if discovery is hard.
- ✓ Use cases where the answer is actually in documents: policies, procedures, product specs, technical documentation, past decisions. The agent can retrieve and synthesize; it can't invent knowledge that doesn't exist somewhere in the corpus.
- ✓ Teams willing to use the gap report as a driver of documentation investment. The agent amplifies existing content and makes absences visible, which creates pressure to close those gaps. Organizations that treat gap reports seriously see compounding improvements.
Not a fit when
- × Organizations with unclear or broken access control. If SharePoint permissions are inherited inconsistently, the agent will surface documents it shouldn't, or hide documents users should see. Fix access control before deployment.
- × Knowledge bases where the source of truth is someone's head, not a document. The agent can index documentation, transcripts, and chat; it can't index tacit knowledge. For these environments, the agent is complementary to a deliberate knowledge-capture effort, not a substitute.
- × Use cases where currency requirements are extreme (e.g. minute-by-minute operational procedures during an incident). Content that moves faster than the indexing cadence won't feel fresh enough.
Frequently Asked Questions
How do you handle documents that are outdated or contradict each other?
Can the system access our SharePoint and respect existing permission groups?
What happens when the AI gives a wrong answer?
How much content can the system handle, and is there a limit to knowledge base size?
How does the agent handle edge cases it hasn't seen before?
How do we audit every decision?
How long to production?
Related reading
AI Agents vs Chatbots: They're Not the Same Thing
Every week someone tells me they want to build an AI agent when what they actually need is a chatbot. Or worse, they build a chatbot when they need an agent. Here's how to tell the difference.
Build AI In-House vs Hire a Consultancy: The Real 2026 Cost Comparison
The build vs buy decision for AI is more nuanced than most comparisons suggest. Here is what the full cost of each path actually looks like in 2026.
What Does Enterprise RAG Actually Cost? A Breakdown
Enterprise RAG costs range from $40K to $150K+ to build, with $2K-$8K in monthly ongoing costs. Here is a full breakdown by component so you can budget accurately.
Ready to build this for your team?
We take this from concept to production deployment, usually in 3–6 weeks.
Start Your Project →