Lightweight VLM Automotive Defect Detection

I want to train and deploy a compact vision-language model that automatically flags visual defects on automotive components. You are free to choose any lightweight VLM architecture, such as CLIP, OWL-ViT, LLaVA or MiniGPT-4, as long as it can be fine-tuned efficiently and run inference on a single GPU or a modest edge device.

Scope
• Build or fine-tune the model on an annotated image dataset (I can share what I have, or we can augment it together).
• Enable detection of common issues such as cracks, deformations and surface scratches, and keep the framework flexible enough to add more defect classes later.
• Deliver an inference pipeline that accepts new photos and returns bounding boxes, class labels and confidence scores.
• Provide clear instructions so I can reproduce training, evaluate performance and integrate the model into a larger quality-control system.

Preferred stack: Python with PyTorch or TensorFlow; Hugging Face libraries are welcome. Lightweight deployment through ONNX or TensorRT is a plus.

Deliverables
- Trained model weights and complete training code
- Inference script/API endpoint
- README explaining setup, retraining steps and hardware requirements

I’m ready to start immediately and will give prompt feedback on milestones and test results.
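To make the expected inference output concrete, here is a minimal sketch of how the pipeline's results (bounding boxes, class labels, confidence scores) might be shaped. It is independent of the chosen model. The names `Detection`, `format_detections`, `to_json`, `DEFECT_CLASSES` and the 0.3 threshold are illustrative assumptions, not requirements, and the raw arrays stand in for whatever the detector actually returns.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical class list; further defect classes can be appended later.
DEFECT_CLASSES = ["crack", "deformation", "surface scratch"]


@dataclass
class Detection:
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels
    label: str    # human-readable defect class
    score: float  # model confidence in [0, 1]


def format_detections(boxes, class_ids, scores,
                      classes=DEFECT_CLASSES, min_score=0.3):
    """Convert raw detector outputs into Detection records.

    Detections scoring below min_score (an assumed threshold) are dropped,
    and the rest are returned with the highest confidence first.
    """
    results = []
    for box, class_id, score in zip(boxes, class_ids, scores):
        if score < min_score:
            continue
        results.append(Detection(tuple(box), classes[class_id], float(score)))
    return sorted(results, key=lambda d: d.score, reverse=True)


def to_json(detections):
    """Serialize detections for an API response or a QC system."""
    return json.dumps([asdict(d) for d in detections])


# Made-up raw outputs for one photo, standing in for real model predictions.
example = format_detections(
    boxes=[[10, 20, 110, 220], [5, 5, 50, 50], [30, 40, 60, 90]],
    class_ids=[0, 2, 1],
    scores=[0.91, 0.12, 0.55],
)
print(to_json(example))
```

Keeping the class list as plain data means a new defect type only needs a new entry plus retraining, which matches the flexibility asked for in the scope.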

Python
