Subject: Project Brief: AI-Powered Media Planning Automation & Data Migration

Hello,

We aim to transform our advertising agency's manual presentation operations, which currently involve hundreds of slides, into a database-driven automation system. The goal of this project is to recover scattered legacy data and build a structure that generates custom media plan presentations (PPTX/PDF) in seconds based on client needs.

The project consists of two main phases:

Phase 1: Data Migration (Legacy PPTX to Database)

We currently have a backlog of PowerPoint (PPTX) files containing hundreds of inventory items. In these files, each slide represents a specific advertising unit (Billboard, Racket/CLP, etc.). Critical data (Price, Dimensions, Location) and images are embedded directly within these slides. Instead of manual migration, we want to automate this process using Python and LLM (GPT-4o Vision) technologies.

Expected Workflow (The Script):

1. Scans the directory for PPTX files.
2. Converts each slide into a high-resolution image (JPG/PNG).
3. Sends this image to the OpenAI GPT-4o API.
4. Using the "System Prompt" provided below, parses the visual data (Location Name, Price, Dimensions, Map Coordinates, and Type) into a structured JSON format.
5. Crops and saves the clean inventory image from the slide to the file system.
6. Writes the extracted clean data and image paths as rows into a Database (PostgreSQL or Airtable).

Phase 2: Presentation Generator Engine

Once the data is centralized, I require an automation engine that works based on my "New Corporate Presentation Template."

Required Features:

Simple Interface (UI): A simple web-based panel.
Filtering: I should be able to filter inventory via the panel (e.g., "Beşiktaş Region", "Rackets/CLP only", "Max Budget 50,000 TL") or enter a chat-based prompt.
Output: The system queries the database, retrieves the selected inventory, and populates the placeholders (Image, Price, Text boxes) in the new template to generate an editable .pptx file.
Technology Stack: The python-pptx library is suggested for the backend generation.

Technical Notes & Prompt

For the data extraction process in Phase 1, the following System Prompt should be used. I expect the script to generate the JSON output based on this structure:

(Developer Note: This prompt is designed in English to ensure data extraction accuracy)

System Role: You are an expert Data Extraction AI specialized in OOH (Out of Home) advertising inventory. Your task is to analyze an image of a presentation slide and extract structured data into a strict JSON format.

Target JSON Structure:

    {
      "location_name": "string",
      "city": "string",
      "unit_type": "string",
      "dimensions": { "width": number, "height": number, "unit": "cm" },
      "price_info": { "amount": number, "currency": "string" },
      "coordinates": "string or null"
    }

Summary of Expectations:

My primary requirement is writing the script to convert the pile of PPTX files into a meaningful, structured database, and subsequently building the mechanism to generate new presentations using this data within my new template.

I look forward to your technical approach, estimated timeline, and budget proposal for this project.

Best regards,
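
P.S. To make the Phase 1 expectation concrete, here is a minimal sketch of how each model response could be checked against the Target JSON Structure before it is written to the database. It is only an illustration, not a required implementation: the function names (validate_record, _is_number) are hypothetical, it uses only the Python standard library, and it assumes that a missing "coordinates" field should be treated the same as null.

```python
import json
from numbers import Number


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Number) and not isinstance(value, bool)


def validate_record(raw_json):
    """Parse one model response and check it against the target structure.

    Returns the parsed dict, or raises ValueError listing every problem found.
    (Hypothetical helper; only the field layout comes from the brief.)
    """
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(record, dict):
        raise ValueError("response must be a JSON object")

    errors = []

    # Top-level text fields.
    for key in ("location_name", "city", "unit_type"):
        if not isinstance(record.get(key), str):
            errors.append(f"{key} must be a string")

    # dimensions: numeric width/height, unit fixed to "cm".
    dims = record.get("dimensions")
    if not isinstance(dims, dict):
        errors.append("dimensions must be an object")
    else:
        for key in ("width", "height"):
            if not _is_number(dims.get(key)):
                errors.append(f"dimensions.{key} must be a number")
        if dims.get("unit") != "cm":
            errors.append('dimensions.unit must be "cm"')

    # price_info: numeric amount, currency as text.
    price = record.get("price_info")
    if not isinstance(price, dict):
        errors.append("price_info must be an object")
    else:
        if not _is_number(price.get("amount")):
            errors.append("price_info.amount must be a number")
        if not isinstance(price.get("currency"), str):
            errors.append("price_info.currency must be a string")

    # coordinates: string or null (assumption: a missing key counts as null).
    coords = record.get("coordinates")
    if coords is not None and not isinstance(coords, str):
        errors.append("coordinates must be a string or null")

    if errors:
        raise ValueError("; ".join(errors))
    return record
```

With a check like this, a response where the price comes back as text (for example "50,000 TL" instead of the number 50000) would be rejected and could be flagged for manual review rather than stored as a broken row.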