Advanced Subtitle & PDF Translation App

Client: AI | Published: 02.12.2025
Budget: $750

Project Title: Professional App for Subtitles Creation, Styling, Burning & PDF Translation

Project Description:
I want to develop a paid application (mobile and/or web) that can automatically create, edit, style, and burn subtitles onto audio and video files. The app should also be able to process and translate large PDF files, including long documents like e-books. The goal is to create a high-quality, easy-to-use tool for generating accurate subtitles, designing them visually, translating text, and exporting everything cleanly.

Main Features Needed:
• Upload support for video/audio files (mp4, mov, mkv, mp3, wav…) and large PDF files with many pages.
• Automatic speech-to-text (ASR/STT) with high accuracy in Hebrew and English, including punctuation and timecodes.
• Automatic subtitle synchronization with editable timecodes and manual adjustment tools.
• Advanced subtitle editor: editing text, splitting/merging lines, search & replace, spell-check, etc.
• Professional subtitle styling: fonts, colors, size, shadows, outlines, positioning, presets, and support for advanced subtitle formats (SRT, VTT, ASS/SSA).
• Hard-burning subtitles into the video and exporting soft-subs as separate files or inside MP4/MKV.
• High-quality translation for subtitles, audio, and PDF text. Should support contextual translation, not just literal.
• PDF processing: OCR for scanned PDFs, extracting text while keeping layout, translating sections/chapters, and exporting as PDF/EPUB.
• Basic video/audio trimming and FFmpeg-based rendering.
• User system and billing: premium accounts, usage-based pricing (per minute/page), payment integration (Stripe/PayPal/credit card).
• Cloud project saving and file storage (S3 or similar).
• Project history, version tracking, and shareable links.
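
To make the hard-burning and styling requirements more concrete, here is a minimal sketch of how a backend might assemble an FFmpeg command that burns a subtitle file into a video. It assumes an FFmpeg build with libass (needed for the `subtitles` filter). The function name `build_burn_command` and the file names are hypothetical, and the sketch does not escape special characters in paths (colons, quotes, backslashes), which a real implementation would have to handle.

```python
from typing import Optional


def build_burn_command(
    video_path: str,
    subtitle_path: str,
    output_path: str,
    style: Optional[dict] = None,
) -> list[str]:
    """Build an FFmpeg argument list that hard-burns subtitles into a video.

    `style` holds ASS style fields (e.g. FontName, FontSize, Outline) that are
    passed to the `subtitles` filter through its `force_style` option.
    Assumption: paths contain no characters that need filter-level escaping.
    """
    video_filter = f"subtitles={subtitle_path}"
    if style:
        pairs = ",".join(f"{key}={value}" for key, value in style.items())
        video_filter += f":force_style='{pairs}'"
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", video_path,     # input video
        "-vf", video_filter,  # render subtitles onto the video frames
        "-c:a", "copy",       # keep the audio stream unchanged
        output_path,
    ]


if __name__ == "__main__":
    command = build_burn_command(
        "input.mp4",
        "subs.srt",
        "output.mp4",
        style={"FontName": "Arial", "FontSize": 28, "Outline": 2},
    )
    print(" ".join(command))
```

The list form can be handed to `subprocess.run` inside a background worker, so long renders do not block the API. Soft-subs would follow a different path (muxing a subtitle stream instead of filtering video) and are not covered by this sketch.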

Recommended Tech Stack (Open to suggestions):
• Backend: Node.js or Python (Django/Flask), REST/GraphQL API
• Mobile: Flutter or React Native
• Web: React
• ASR/STT: WhisperX, Google Speech, or Azure Speech
• OCR for PDF: Tesseract / OCRmyPDF
• Translation: Google/Microsoft Translation API or integrated LLM translation
• Video processing: FFmpeg
• Database: PostgreSQL + cloud file storage (S3)
• Task queue for heavy jobs: Celery / Redis / RabbitMQ

User Flow Overview:
1. User uploads a video/audio file or a PDF.
2. Selects source language and target language (if translation is needed).
3. System generates subtitles or extracts PDF text automatically.
4. User edits and styles subtitles in a clean editor.
5. System burns subtitles or exports them as files.
6. User downloads results or shares a link.
7. Payment is handled based on usage or subscription.

Deliverables:
• Fully working app or web-app
• Complete backend with documented API
• UI/UX design (Figma)
• FFmpeg integration
• OCR + ASR integration
• Deployment instructions
• Example input/output files
• Testing (unit + integration)

Milestones (suggested):
1. Wireframes + full specification
2. MVP: upload → ASR → basic subtitle editor → export SRT
3. Full version: styling, burning, translation, PDF OCR
4. Payments + cloud storage + optimization
5. Final delivery + documentation + support period
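
As an illustration of the last step of the MVP milestone (export SRT), the sketch below turns timed text segments into SRT content. The `Segment` class and the function names are hypothetical. It assumes the ASR step yields start and end times in seconds, which is a common shape for ASR output but should be checked against whichever engine is chosen.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One subtitle cue; times are in seconds (assumed ASR output shape)."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT: numbered cues separated by blank lines."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        time_range = (
            f"{format_timestamp(segment.start)} --> "
            f"{format_timestamp(segment.end)}"
        )
        blocks.append(f"{index}\n{time_range}\n{segment.text.strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    cues = [
        Segment(0.0, 1.5, "Hello"),
        Segment(1.5, 3.0, "שלום"),
    ]
    print(to_srt(cues))
```

When writing the result to disk, UTF-8 encoding keeps Hebrew text intact. VTT and ASS/SSA exports would need their own formatters, since their timestamp and styling syntax differ from SRT.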