I’ve been diving into AI engineering concepts and recently I’ve been working with an AI model from huggingface, and using LocalAI, qdrant, and python to develop a simple Q and A chatbot, that can return information from any pdf file! I got started by following Zen Van Riel’s excellent course in the Skool community. If you’re interested in learning AI engineering, I highly recommend checking out his community: AI Engineer on Skool.
Key Components
The current implementation runs LocalAI and Qdrant in Docker containers, with models downloaded locally via a script I created called download_models
. This setup forms the foundation of everything that makes the Q&A functionality work.
The PDF Q&A system consists of two main servers:
- Embeddings Server: Transforms text chunks into vector representations that capture semantic meaning
- PDF Q&A Server: Orchestrates the entire process, handling:
- PDF uploads and text extraction
- Communication with the embeddings service
- Vector similarity search (via Qdrant)
- Question processing
- Streaming results back to the user
Future Goals
While I’m happy with the current implementation, there’s still more I want to do:
- Integrate Local Model with GPU acceleration: Run AI models directly on my machine to leverage GPU acceleration rather than using containers.
- Cloud AI Integration: Experiment with various cloud AI services to compare performance and capabilities.
- Home Lab Setup: Configure my old gaming PC as a dedicated workspace for AI experimentation.
Try It Yourself
You can find the complete code on GitHub: eccentricnode/qa-chatbot
If you’re interested in AI engineering and building similar projects, I highly recommend checking out Zen Van Riel’s course in the AI Engineer Skool community. It’s been invaluable in helping me understand both the theoretical concepts and practical implementation details.