Backend Engineer for RAG in Healthcare Setting

Customer: AI | Published: 07.03.2026
Budget: $250

Backend Engineer Needed for RAG Architecture (LLM + Vector Database)

---

Project Description

I am looking for an experienced developer to design and implement a production-grade RAG (Retrieval-Augmented Generation) backend for a controlled knowledge system. This is not a generic chatbot project. The goal is to build a system in which a language model can query a structured knowledge base while enforcing strict safety and retrieval rules. The system will be used in a clinical / healthcare-related context, so accuracy, traceability, and rule enforcement are critical.

---

Current Infrastructure

The current setup includes:

- WordPress website hosted on Hostinger (content origin / source of truth)
- Cloud environment on Hetzner available for backend deployment
- Initial automation infrastructure already started

The WordPress site must remain separate from the AI backend. It acts only as the knowledge source, not as the execution environment for the AI system.

---

Objective

Build a RAG pipeline that allows an LLM to query a curated knowledge base while enforcing strict retrieval rules and preventing unsupported answers.

---

Key System Rules

The system must enforce the following rules:

1. No-retrieval = no-answer

If the system cannot retrieve relevant evidence from the knowledge base, the LLM must not generate an answer. The pipeline should stop and return a controlled response instead.

2. Evidence-based responses

All responses must be grounded in retrieved source chunks. The model must not:

- invent information
- infer protocols
- generate details not present in the sources

3. Context separation

The system must prevent cross-context mixing between different topics or treatment types. Retrieval should use semantic filtering and metadata constraints to ensure the correct context.

4. Traceability

Responses should allow traceability to:

- source chunks
- document IDs
- content versioning when the website is updated

---

Expected Architecture

Content origin:

- WordPress / Hostinger

RAG backend:

- ingestion pipeline
- semantic chunking
- embeddings
- vector database
- retrieval + validation layer

LLM layer:

- invoked only after successful retrieval

Rules must be enforced outside the model, in the orchestration layer.

---

Infrastructure

Possible stack:

- Python backend
- Vector database (Pinecone, Weaviate, Qdrant, etc.)
- Hetzner infrastructure
- API layer for query handling

Content ingestion may come from:

- controlled exports
- PDFs
- structured feeds
- limited and controlled scraping if necessary

---

Deliverables

- RAG backend architecture
- ingestion pipeline
- vector database configuration
- retrieval and validation logic
- API layer
- deployment instructions
- documentation

---

Screening Question

Please answer this question in your proposal: How would you implement a strict "no-retrieval = no-answer" rule in a production RAG architecture?

---

Ideal Candidate

- Experience building RAG systems
- Experience with vector databases
- Python backend development
- Experience with LLM orchestration
- Ability to design production architectures

---

Budget

Open to proposals depending on experience and scope.
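To illustrate the kind of answer the screening question is after, here is a minimal sketch of a "no-retrieval = no-answer" gate enforced in the orchestration layer, outside the model. The `Chunk` shape, score threshold, and refusal message are illustrative assumptions, not part of this spec; the point is that the LLM is simply never invoked without qualifying evidence.

```python
# Sketch: orchestration-layer gate enforcing "no-retrieval = no-answer".
# Chunk fields, MIN_SCORE, and the refusal text are assumptions for
# illustration; tune thresholds per embedding model and vector database.
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    doc_id: str
    score: float  # similarity score returned by the vector database


MIN_SCORE = 0.75  # assumed similarity cutoff
MIN_CHUNKS = 1    # require at least one qualifying chunk

REFUSAL = "No supporting evidence was found in the knowledge base for this question."


def answer(question: str, retrieve, generate) -> dict:
    """Gate the LLM call: only invoke `generate` after successful retrieval.

    `retrieve(question)` -> list[Chunk]; `generate(question, chunks)` -> str.
    Both are injected so the rule lives in the orchestrator, not the model.
    """
    chunks = [c for c in retrieve(question) if c.score >= MIN_SCORE]
    if len(chunks) < MIN_CHUNKS:
        # Controlled response: the LLM is never called without evidence.
        return {"answer": REFUSAL, "sources": [], "grounded": False}
    return {
        "answer": generate(question, chunks),
        "sources": [c.doc_id for c in chunks],
        "grounded": True,
    }
```

Because `retrieve` and `generate` are injected, the gate can be unit-tested with stubs and the rule cannot be bypassed by prompt content.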
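For rule 2 (evidence-based responses), generation can be followed by a validation pass that rejects answers containing sentences unsupported by the retrieved chunks. The lexical-overlap check below is a deliberately crude sketch under assumed thresholds; production systems typically layer on NLI or citation verification.

```python
# Sketch: post-generation grounding check for the validation layer.
# This lexical-overlap heuristic is an assumption for illustration only;
# the 0.5 threshold is arbitrary and would need tuning or replacement.
import re


def is_grounded(answer: str, chunks: list[str], min_overlap: float = 0.5) -> bool:
    """Return False if any sentence shares too few words with the evidence."""
    evidence = set(re.findall(r"[a-z0-9]+", " ".join(chunks).lower()))
    for sentence in re.split(r"(?<=[.!?])\s+", answer):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if not words:
            continue
        if len(words & evidence) / len(words) < min_overlap:
            return False  # sentence is unsupported by the sources
    return True
```

An answer failing this check would be replaced by the same controlled refusal used when retrieval fails.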
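Rule 3 (context separation) is usually enforced by combining vector similarity with hard metadata constraints. The in-memory sketch below shows the shape of such a query; the metadata keys (`treatment_type`) are assumptions for illustration, and in production the equivalent filter runs inside the vector database itself (Qdrant payload filters, Weaviate `where` filters, Pinecone metadata filters).

```python
# Sketch: metadata-constrained retrieval to prevent cross-context mixing.
# Metadata keys are assumptions; real vector databases apply equivalent
# filters server-side at search time.
from dataclasses import dataclass, field


@dataclass
class IndexedChunk:
    text: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


def filtered_search(index, query_vec, must_match: dict, k: int = 5):
    """Return top-k chunks whose metadata satisfies every constraint."""
    def sim(a, b):  # dot product; assumes normalized embeddings
        return sum(x * y for x, y in zip(a, b))

    candidates = [
        c for c in index
        if all(c.metadata.get(key) == val for key, val in must_match.items())
    ]
    return sorted(candidates, key=lambda c: sim(query_vec, c.embedding),
                  reverse=True)[:k]
```

Pushing the filter into the database (rather than post-filtering in application code) guarantees that out-of-context chunks can never reach the LLM.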
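For rule 4 (traceability), each response can carry the chunk IDs, document IDs, and a content version derived from the page body, so answers remain auditable after the WordPress site is updated. Field names and the hash-based versioning scheme below are illustrative assumptions.

```python
# Sketch: traceable response payload. Field names are assumptions; the
# content version is derived from the page body so a WordPress update
# produces a new, distinguishable version on re-ingestion.
import hashlib
from dataclasses import dataclass, asdict


@dataclass
class SourceRef:
    chunk_id: str
    doc_id: str
    content_version: str  # e.g. hash or revision timestamp of the source page


def version_of(raw_page: str) -> str:
    """Derive a stable, content-addressed version identifier for a page."""
    return hashlib.sha256(raw_page.encode("utf-8")).hexdigest()[:12]


def build_response(answer_text: str, sources: list[SourceRef]) -> dict:
    """Attach full provenance to every answer returned by the API layer."""
    return {"answer": answer_text, "sources": [asdict(s) for s in sources]}
```

A content-addressed version also lets the ingestion pipeline skip pages whose hash has not changed since the last run.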
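On the ingestion side, a chunker that stamps every chunk with the document ID and content version ties the pipeline together with the traceability rule. The fixed-size overlapping split below is a simplistic stand-in for the semantic chunking the spec asks for; sizes and the ID scheme are assumptions.

```python
# Sketch: ingestion-side chunking that attaches the metadata the retrieval
# and traceability rules depend on. Fixed-size overlap is a placeholder for
# real semantic chunking; sizes and the chunk_id scheme are assumptions.
def chunk_document(text: str, doc_id: str, version: str,
                   size: int = 500, overlap: int = 100) -> list[dict]:
    """Split `text` into overlapping chunks tagged with doc ID and version."""
    step = size - overlap
    chunks = []
    for i, start in enumerate(range(0, max(len(text), 1), step)):
        piece = text[start:start + size]
        if not piece:
            break
        chunks.append({
            "chunk_id": f"{doc_id}:{version}:{i}",
            "doc_id": doc_id,
            "content_version": version,
            "text": piece,
        })
        if start + size >= len(text):
            break
    return chunks
```

Embedding these records and upserting them keyed by `chunk_id` makes re-ingestion after a site update idempotent: a new version produces new IDs, and stale versions can be deleted in one filtered pass.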