USE CASE

AI Customer Support Chatbot

A GPT-powered chatbot grounded in your business data: it answers FAQs, routes complex queries to human agents, and logs every query for later analysis.

Python, OpenAI, FastAPI, ChromaDB

The Problem

A service company gets 100+ customer queries daily — mostly repetitive questions about pricing, features, and processes. The support team spends 70% of their time on FAQs instead of complex issues.

The Solution

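The solution is an AI chatbot grounded in the company's documentation, FAQs, and product info. It uses RAG (Retrieval-Augmented Generation): for each query, it retrieves the most relevant document chunks and passes them to the model as context, so answers stay tied to the source material. When the model cannot answer from that context, the chatbot escalates to a human agent.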

Architecture

User Query: Chat widget / API
Vector Search: ChromaDB / embeddings
GPT Response: OpenAI + context
Answer / Escalate: Respond or hand off
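
The four stages above can be sketched as a single function. This is a minimal illustration rather than the production code: `retrieve` and `generate` are hypothetical callables standing in for the ChromaDB search and the OpenAI call, and the stub implementations exist only so the flow can be run without API keys.

```python
from typing import Callable

FALLBACK = "I don't have that information"

def answer_query(
    query: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, str], str],
) -> dict:
    """Run one query through retrieve -> generate -> answer/escalate."""
    chunks = retrieve(query)                      # Vector Search
    context = "\n\n".join(chunks)
    reply = generate(query, context)              # GPT Response
    escalate = FALLBACK.lower() in reply.lower()  # Answer / Escalate
    return {"reply": reply, "escalate": escalate}

# Stubs below are assumptions for illustration only.
def fake_retrieve(query: str) -> list[str]:
    kb = {"pricing": "Pricing details live on the pricing page."}
    return [text for key, text in kb.items() if key in query.lower()]

def fake_generate(query: str, context: str) -> str:
    if context:
        return context
    return FALLBACK + ". Shall I connect you with a human agent?"
```

Passing the two stages in as callables keeps the control flow testable on its own; the real retrieval and generation code appears in Step 2.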

Step-by-Step Execution Flow

Step 1: Document Ingestion & Embedding

Load your business documents into a vector database for semantic search.

# ingest.py: run once to load documents. Re-running appends duplicate
# chunks unless ./chroma_db is deleted first.
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Load documents from a folder
loader = DirectoryLoader("./docs", glob="**/*.md", loader_cls=TextLoader)
documents = loader.load()

# Split into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

# Create embeddings and store in ChromaDB
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(
    chunks,
    embeddings,
    persist_directory="./chroma_db"
)
print(f"Ingested {len(chunks)} chunks from {len(documents)} documents")
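
To make the `chunk_size=500, chunk_overlap=50` settings concrete, here is a simplified fixed-window chunker. It is an illustration only: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line boundaries, whereas this sketch cuts at exact character offsets. `chunk_text` is a hypothetical helper, not part of LangChain.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share
    chunk_overlap characters with the previous window."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.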

Step 2: RAG Chatbot Core

The chatbot retrieves relevant context from the vector store before generating a response.

# chatbot.py
from openai import OpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

client = OpenAI()
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma(
    persist_directory="./chroma_db",
    embedding_function=embeddings
)

SYSTEM_PROMPT = """You are a helpful customer support assistant for Sinkur.
Answer questions based ONLY on the provided context. If the context doesn't
contain the answer, say "I don't have that information" and offer to
connect the user with a human agent. Be concise and professional."""

def get_response(user_message: str, chat_history: list) -> dict:
    # 1. Retrieve relevant context
    docs = vectorstore.similarity_search(user_message, k=3)
    context = "\n\n".join([doc.page_content for doc in docs])

    # 2. Build messages with context
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "system", "content": f"Context:\n{context}"},
        *chat_history,
        {"role": "user", "content": user_message}
    ]

    # 3. Generate response
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        max_tokens=400,
        temperature=0.3
    )

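    # content is Optional in the OpenAI SDK, so fall back to an empty string
    reply = response.choices[0].message.content or ""

    # 4. Check if escalation is needed by matching the fallback phrases
    #    that SYSTEM_PROMPT instructs the model to use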
    needs_escalation = any(phrase in reply.lower() for phrase in [
        "i don't have that information",
        "connect you with",
        "human agent"
    ])

    return {
        "reply": reply,
        "escalate": needs_escalation,
        "sources": [doc.metadata.get("source", "") for doc in docs]
    }
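
The keyword check in step 4 depends on the model repeating the fallback wording. A complementary signal is retrieval quality: if even the best-matching chunk is far from the query, the context is probably irrelevant. The sketch below assumes distance scores such as those returned by LangChain's `similarity_search_with_score`, where (with Chroma's default metric) a lower score means a closer match. The `max_distance` value of 0.8 is an arbitrary placeholder that would need tuning against real queries, and both function names are hypothetical.

```python
def low_retrieval_confidence(distances: list[float], max_distance: float = 0.8) -> bool:
    """Return True when no retrieved chunk is close enough to the query."""
    if not distances:
        return True  # nothing retrieved at all
    return min(distances) > max_distance

def should_escalate(reply: str, distances: list[float], max_distance: float = 0.8) -> bool:
    """Escalate if the model used the fallback wording OR retrieval looked weak."""
    fallback_phrases = ("i don't have that information", "human agent")
    said_fallback = any(phrase in reply.lower() for phrase in fallback_phrases)
    return said_fallback or low_retrieval_confidence(distances, max_distance)
```

Combining both signals with `or` errs on the side of handing off, which suits a support setting where a wrong answer costs more than an unnecessary escalation.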

Step 3: FastAPI Backend

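# main.py
import logging
from typing import Literal

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel

from chatbot import get_response

# Without this, INFO records are dropped by the default WARNING threshold
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("chatbot")

app = FastAPI(title="Sinkur Support Chatbot")

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://sinkur.com"],
    allow_methods=["POST"],
    allow_headers=["Content-Type"]
)

class ChatMessage(BaseModel):
    # Restrict roles so a client cannot inject its own system messages
    role: Literal["user", "assistant"]
    content: str

class ChatRequest(BaseModel):
    message: str
    history: list[ChatMessage] = []
    session_id: str = ""

# Plain `def` on purpose: get_response makes blocking network calls, and
# FastAPI runs sync endpoints in a threadpool instead of stalling the event loop
@app.post("/api/chat")
def chat(request: ChatRequest):
    # model_dump() is pydantic v2; on pydantic v1 use m.dict() instead
    history = [m.model_dump() for m in request.history]
    result = get_response(request.message, history)

    # Log for analytics
    logger.info(f"Session: {request.session_id} | "
                f"Query: {request.message} | "
                f"Escalated: {result['escalate']}")

    return result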
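
Because the widget resends the whole conversation on every request, the prompt grows with each turn, and the history arrives from an untrusted client. A minimal sketch of bounding and filtering it before it reaches `get_response` follows. `trim_history` and the default of 10 messages are assumptions for illustration, not part of the original backend.

```python
ALLOWED_ROLES = {"user", "assistant"}

def trim_history(history: list, max_messages: int = 10) -> list[dict]:
    """Keep only well-formed user/assistant messages, then the most recent ones."""
    clean = []
    for item in history:
        if not isinstance(item, dict):
            continue  # skip anything that is not a message object
        role, content = item.get("role"), item.get("content")
        if role in ALLOWED_ROLES and isinstance(content, str):
            clean.append({"role": role, "content": content})
    # Guard against max_messages == 0, since clean[-0:] would return everything
    return clean[-max_messages:] if max_messages > 0 else []
```

In the endpoint this would be called as `get_response(request.message, trim_history(history))`. Counting messages is a crude proxy for prompt size; a token-based budget would be more precise.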

Step 4: Web Chat Widget (JavaScript)

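// chat-widget.js: embed on any page
// Assumes the host page defines addMessageToUI(role, text) and
// showEscalationOption(); neither is shown here.
(function() {
    const API_URL = '/api/chat';
    let history = [];
    const sessionId = crypto.randomUUID();

    async function sendMessage(message) {
        // Add user message to UI
        addMessageToUI('user', message);

        let data;
        try {
            const response = await fetch(API_URL, {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({
                    message: message,
                    history: history,
                    session_id: sessionId
                })
            });
            if (!response.ok) {
                throw new Error(`HTTP ${response.status}`);
            }
            data = await response.json();
        } catch (err) {
            // Network or server failure: tell the user and offer a human handoff
            addMessageToUI('assistant', 'Sorry, something went wrong. Please try again.');
            showEscalationOption();
            return;
        }

        // Update history only after a successful round trip
        history.push({ role: 'user', content: message });
        history.push({ role: 'assistant', content: data.reply });

        // Show response
        addMessageToUI('assistant', data.reply);

        // If escalation needed, show human handoff button
        if (data.escalate) {
            showEscalationOption();
        }
    }

    // Expose the sender so the page's input handler can call it
    // (window.sendChatMessage is a hypothetical name)
    window.sendChatMessage = sendMessage;
})();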
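
For testing the endpoint without a browser, the widget's request can be reproduced from a script. The sketch below builds the same JSON body the widget sends and posts it with the standard library. `build_chat_payload` and `post_chat` are hypothetical helpers, and the URL in the closing comment assumes the FastAPI app is running locally on port 8000.

```python
import json
import urllib.request
import uuid

def build_chat_payload(message: str, history=None, session_id=None) -> bytes:
    """Build the JSON body in the shape the widget sends to /api/chat."""
    body = {
        "message": message,
        "history": history or [],
        "session_id": session_id or str(uuid.uuid4()),
    }
    return json.dumps(body).encode("utf-8")

def post_chat(url: str, payload: bytes, timeout: float = 10.0) -> dict:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Example call, assuming a local server:
# post_chat("http://localhost:8000/api/chat", build_chat_payload("What are your prices?"))
```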

Business Impact

70% of queries automated: Handles FAQs instantly
24/7 availability: No wait times
Customer insights: Know what customers ask
Team focuses on complex issues: Higher-value support
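
The customer insights come from the log lines that main.py writes for each request. A rough sketch of turning those lines into an escalation rate and a list of the most frequent queries follows. It assumes the `Session: ... | Query: ... | Escalated: ...` format from Step 3, and `summarize_logs` is a hypothetical helper. Grouping queries by exact lowercase text is deliberately crude; real analysis would likely cluster similar questions.

```python
import re
from collections import Counter

# The lazy session group and greedy query group tolerate " | " inside a query.
LOG_PATTERN = re.compile(
    r"Session: (?P<session>.*?) \| Query: (?P<query>.*) \| "
    r"Escalated: (?P<escalated>True|False)\s*$"
)

def summarize_logs(lines, top_n: int = 5) -> dict:
    """Count logged queries, the share that escalated, and the most common ones."""
    queries = Counter()
    total = 0
    escalated = 0
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # not a chatbot analytics line
        total += 1
        if match["escalated"] == "True":
            escalated += 1
        queries[match["query"].strip().lower()] += 1
    rate = escalated / total if total else 0.0
    return {
        "total": total,
        "escalation_rate": rate,
        "top_queries": queries.most_common(top_n),
    }
```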

Tech Stack

LLM: OpenAI GPT-4o-mini (cost-effective, fast)
Embeddings: OpenAI text-embedding-3-small
Vector DB: ChromaDB (local) or Azure AI Search (production)
Framework: LangChain + FastAPI
Frontend: Vanilla JS chat widget
Deployment: Azure App Service (Docker container)
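
Since the stack lists one vector store for local use and another for production, it may help to keep the chatbot code independent of the backend. The sketch below shows one way to do that with a small interface. The `Retriever` protocol, `InMemoryRetriever`, and `build_context` are all hypothetical names; no ChromaDB or Azure AI Search client calls are shown, and a real adapter would wrap each service behind the same `search` method.

```python
from typing import Protocol

class Retriever(Protocol):
    """Anything that can return the k most relevant text chunks for a query."""
    def search(self, query: str, k: int = 3) -> list[str]: ...

class InMemoryRetriever:
    """Toy keyword-overlap retriever standing in for a real vector store."""
    def __init__(self, documents: list[str]):
        self.documents = documents

    def search(self, query: str, k: int = 3) -> list[str]:
        terms = set(query.lower().split())
        # Score each document by how many query words it shares
        scored = [(len(terms & set(doc.lower().split())), doc) for doc in self.documents]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [doc for _, doc in ranked[:k]]

def build_context(retriever: Retriever, query: str, k: int = 3) -> str:
    """Join retrieved chunks the same way chatbot.py builds its context string."""
    return "\n\n".join(retriever.search(query, k=k))
```

With this shape, switching backends means writing one new class with a `search` method, while `build_context` and the prompt assembly stay unchanged.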

Want an AI Chatbot for Your Business?

We'll train it on your content, deploy it on your site, and keep it updated — all within weeks.
