TEST PROJECT COMPLETED

AI LLM Model
Research & Training

Our active conversational chatbot testing phase has officially reached its final epoch and is now offline. Thank you to all researchers, beta testers, and machine learning enthusiasts who participated.

Stay connected for future builds

I am moving on to the next iteration of advanced cognitive systems, agentic architectures, and fine-tuning experiments. Follow my professional journey on LinkedIn for technical breakdowns, papers, and future tools.

Follow Robert Mwanzia

🤖 AI-Powered Support System Overview

This system combines a local Large Language Model (LLM) with a document knowledge base, allowing users to ask questions and receive accurate answers sourced directly from company policies, manuals, and support documentation.

🚧 Tech Stack

Backend: Python, Flask & Nginx
Local Engine: Ollama + Gemma 2B
Retrieval Engine: FAISS Vector Search
Embeddings: Sentence Transformers
Architecture: RAG (Retrieval-Augmented Generation)
Frontend: HTML, CSS, and vanilla JS

🧠 How It Works (RAG Pipeline)

1. Load Docs: PDFs are parsed and loaded from /docs

2. Chunk & Embed: Text is split into chunks, and Sentence Transformers creates a vector embedding for each chunk

3. Vector Storage: Embeddings are stored in a FAISS index

4. Retrieve & Generate: Each user query retrieves the closest chunks, which are passed to Gemma 2B as context for grounded generation
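
The four steps above can be sketched end to end in plain Python. This is a minimal illustration under loud assumptions, not the project's rag.py: a bag-of-words count vector stands in for a Sentence Transformers embedding, a brute-force cosine search stands in for FAISS, and the function names, chunk sizes, sample documents, and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping word windows (step 2, chunking)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: a bag-of-words count vector.

    Assumption: this replaces a Sentence Transformers model, which would
    return a dense float vector instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Brute-force top-k search (step 4); FAISS fills this role in the stack."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical sample chunks, standing in for text parsed from the PDFs.
docs = [
    "Employees accrue annual leave monthly and may carry over five days.",
    "Restart the VPN client before opening a help desk ticket.",
    "Expense reports are due by the fifth working day of each month.",
]
question = "How do I fix the VPN?"
top_chunks = retrieve(question, docs, k=1)
prompt = build_prompt(question, top_chunks)
```

The prompt string is what would be handed to the local model. Keeping retrieval and prompt assembly as separate functions makes it straightforward to swap the toy pieces for real embeddings and a real vector index later.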

🧱 Project Structure

ai_support_app/
├── app.py            # Flask backend API endpoint with routing and RAG hooks
├── rag.py            # Core retrieval engine (PDF chunking + local FAISS pipeline orchestration)
├── requirements.txt  # Package manifest (ollama, faiss-cpu, sentence-transformers, flask)
├── docs/             # Local PDF knowledge base storage
│   ├── policy.pdf    # Company HR policies, onboarding documentation
│   └── manual.pdf    # Technical SOPs and infrastructure guides
├── index.faiss       # Serialized local vector index (auto-generated at runtime)
├── templates/
│   └── index.html    # Beautiful real-time operational interface with dynamic canvas elements
└── static/
    ├── app.js        # Asynchronous socket/HTTP message stream handlers for instant response UI
    └── style.css     # Modern style declarations with glassmorphism components
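
To make the app.py role more concrete, here is a small, framework-independent sketch of the request handling a Flask route could delegate to. The "question" field name, the length limit, the status codes, and the handle_ask and answer_fn names are all assumptions for illustration; the project's actual endpoint may look quite different.

```python
import json


def handle_ask(raw_body, answer_fn, max_chars=2000):
    """Validate a JSON request body and return (response_dict, http_status).

    answer_fn is any callable that maps a question string to an answer
    string, for example a hypothetical RAG hook exposed by rag.py.
    """
    try:
        payload = json.loads(raw_body)
    except (TypeError, ValueError):
        return {"error": "body must be valid JSON"}, 400

    # Only accept a JSON object with a non-empty string under "question".
    question = payload.get("question") if isinstance(payload, dict) else None
    if not isinstance(question, str) or not question.strip():
        return {"error": "'question' must be a non-empty string"}, 400
    if len(question) > max_chars:
        return {"error": "question too long"}, 413

    return {"answer": answer_fn(question.strip())}, 200


# Example call with a stub in place of the real retrieval + generation hook.
body, status = handle_ask('{"question": "What is the leave policy?"}',
                          lambda q: f"(stub answer for: {q})")
```

Inside a Flask view, the raw body could come from request.get_data(as_text=True) and the result could be returned as jsonify(body), status. Keeping the validation outside the framework makes it easy to unit test without starting a server.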

💡 Potential Use Cases

✔ IT Help Desk: Automated diagnosis and routing of technical issues.

✔ HR Policy Portals: Secure queries on employee benefits and general manuals.

✔ Client Rep Support: Fast retrieval from complex technical documents.

✔ Documentation Assistants: Auto-retrieve information across thousands of pages.

✔ SOP Compliance: Real-time access to audited protocol guidelines.

✔ Educational Systems: Grounded study guide extraction and testing.

🔒 Strict Data Privacy & Local Edge Compliance

Unlike cloud-based AI solutions that require sending data to external providers, this system runs entirely on local infrastructure. Sensitive documents, internal policies, customer records, and proprietary knowledge remain within the organization's environment.

🔸 No data sent to external providers

🔸 Local compliance readiness

🔸 No cloud API subscription cost

🔸 Zero cloud dependencies

🔸 Runs on isolated / air-gapped systems

🔸 Fewer hallucinations, since answers are grounded in retrieved documents
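
As an illustration of the local-only design, the sketch below sends a prompt to an Ollama server on the same machine using only the Python standard library, so the text never leaves the host. It assumes Ollama's default address (localhost, port 11434) and its /api/generate endpoint with streaming disabled. The model tag "gemma:2b" is a guess at how Gemma 2B was pulled, and the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="gemma:2b", url=OLLAMA_URL):
    """Build a POST request for a single, non-streamed completion."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=120):
    """Send the prompt to the local server and return the generated text.

    Requires a running Ollama instance with the model already pulled.
    """
    request = build_generate_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

Because the only network target is the loopback address, the same code works on an isolated or air-gapped machine once the model weights are present locally. CPU-only inference can be slow, which is why the timeout here is generous.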

💫 Test Deployment: A Linode VM with 4 vCPUs, 8 GB RAM, and no GPU hosted the live public test for 4 hours.

Deployment Status: Archived (server offline)

Observed System Benchmarks

CPU (%), Linode 4 vCPUs
Max Peak: 91.17 %
Average: 3.20 %
Last: 9.81 %

Disk I/O (blocks/s), I/O Rate & Swap
Max Rate: 107.69
Average: 1.47
Max Swap: 0.13

Network IPv4, Public In / Out
Public Inbound: Max 19.71 Mb/s, Avg 147.04 Kb/s
Public Outbound: Max 27.51 Kb/s, Avg 1.6 Kb/s

Network IPv6, Public In / Out
Public Inbound: Max 59.17 Mb/s, Avg 441.23 Kb/s
Public Outbound: Max 51.97 Mb/s, Avg 423 b/s

Project Discussion & Interactions

Read technical implementation notes, join active community interactions, and view demo conversations under the main release post.

Interact on the LinkedIn Post
