I want to build a self-hosted language model that runs entirely offline, can be trained on data I feed it on an ongoing basis (PDFs, images, and Word documents), and acts as our first-line internal support agent. Its main duties will be answering common FAQs and providing technical guidance when users run into problems with our product.

Because the model must operate without an internet connection, every component (model weights, retrieval pipeline, vector store, and UI) has to live on our own hardware. I have not settled on a specific tech stack yet, so I am open to your recommendation, whether that ends up being Python (e.g., LangChain, llama.cpp, or Haystack), a JavaScript-based solution like Transformers.js, or even a JVM approach. Whichever route you choose, please make sure it can be installed and updated in an air-gapped environment.

Deliverables
• A fully functional offline LLM fine-tuned or otherwise adapted to our support domain
• A local retrieval or knowledge-base mechanism so the model can pull accurate answers from our existing documentation
• A simple desktop or browser UI for agents to interact with the model and review responses
• A clear setup script plus step-by-step deployment instructions so our team can reproduce the environment on fresh hardware
• A short hand-off session to walk us through maintenance, model updates, and expanding the knowledge base

If you have prior experience with offline language models that work, let's chat.
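To make the retrieval requirement concrete, here is a toy sketch of the idea: rank local documents against a question using only the Python standard library. This is purely my own illustration of the behavior I expect (the function names and the simple TF-IDF-style scoring are mine, not a required design); a real build would swap in an embedding model and a vector store, all running offline.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into alphanumeric tokens
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    # docs: {doc_id: text}. Returns per-document term counts
    # and document frequencies for each term.
    term_counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    df = Counter()
    for counts in term_counts.values():
        df.update(counts.keys())
    return term_counts, df

def retrieve(query, term_counts, df, k=3):
    # Score each document by a TF-IDF-style sum: rarer terms count more.
    n_docs = len(term_counts)
    scores = {}
    for doc_id, counts in term_counts.items():
        score = 0.0
        for term in tokenize(query):
            if term in counts:
                score += counts[term] * math.log(1 + n_docs / df[term])
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Stand-in documentation snippets (hypothetical file names)
docs = {
    "install.md": "How to install the product offline from the USB bundle",
    "reset.md": "Resetting a user password when the account is locked",
    "faq.md": "Frequently asked questions about licenses and updates",
}
term_counts, df = build_index(docs)
print(retrieve("how do I reset a locked password", term_counts, df))
```

The retrieved snippets would then be passed to the local model as context so its answers stay grounded in our own documentation.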