Telugu OCR Speech App

Client: AI | Published: 05.12.2025

I need an end-to-end system that lets a user open a mobile app, capture or choose a scanned page from any Telugu book, and immediately hear the text read aloud, while the backend checks whether the same passage already exists in my book-preview dataset.

Here is the flow I have in mind:

• The mobile app (must run on both iOS and Android) presents a simple “Upload / Capture” screen.
• On upload, the image is sent to Google Cloud, where OCR is performed. The OCR must handle multiple Telugu fonts and return clean Unicode.
• The extracted text is fed to a Telugu Text-to-Speech engine, and the audio stream is returned to the app.
• In parallel, the text is compared against my preview catalogue, currently stored as a folder of PDFs. If a match is found, the service returns “found”, retrieves the matching PDF from the dataset, and provides a download link for it with a timestamp appended. If no match is found, it simply returns “not found.” I will provide the PDFs.

You will deliver:

1. Cross-platform mobile app source and builds.
2. Google Cloud backend code (OCR, TTS, text-matching logic, secure API).
3. A small Firestore / Cloud SQL setup or similar for hosting the dataset converted from Excel, with a routine that keeps it in sync.
4. Deployment scripts or Terraform so I can reproduce the setup in my own GCP project.
5. Brief documentation and a test video showing the workflow from image capture to audio playback and dataset response.

Acceptance criteria: the app recognises at least 90 % of the text on standard printed pages, returns audio within five seconds for a typical page, and reliably flags matches in the dataset.

If you have experience with tools like Tesseract, Google Vision API, Cloud Text-to-Speech, Flutter or React Native, and efficient string-matching on large text collections, let’s get started.

I have also attached a sample dataset containing one PDF.
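
To make the text-matching step of the flow more concrete, here is a minimal Python sketch of one possible approach, not a required design. It normalises the OCR output to Unicode NFC, strips spacing and punctuation, and measures how many of the page's character 5-grams also occur in each catalogue text. Everything named here is an assumption for illustration: the function names, the 0.8 threshold, the example base URL, and the choice to append the timestamp as a `ts` query parameter. The catalogue is assumed to be a mapping from PDF file name to its already extracted text.

```python
from __future__ import annotations

import unicodedata
from datetime import datetime, timezone


def normalize(text: str) -> str:
    """NFC-normalise and keep only letters, combining marks and digits.

    Telugu vowel signs and the virama are combining marks, so the "M"
    categories must be kept or the text would be corrupted.
    """
    text = unicodedata.normalize("NFC", text)
    return "".join(ch for ch in text if unicodedata.category(ch)[0] in "LMN")


def shingles(text: str, n: int = 5) -> set[str]:
    """Return the set of character n-grams of the normalised text."""
    s = normalize(text)
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def containment(query: str, document: str, n: int = 5) -> float:
    """Fraction of the query's n-grams that also appear in the document."""
    q = shingles(query, n)
    if not q:
        return 0.0
    return len(q & shingles(document, n)) / len(q)


def match_passage(
    ocr_text: str,
    catalogue: dict[str, str],   # hypothetical: PDF file name -> extracted text
    base_url: str,               # hypothetical download prefix
    threshold: float = 0.8,      # assumed cut-off, to be tuned on real pages
    now: datetime | None = None,
) -> dict:
    """Return a "found" response with a timestamped link, or "not found"."""
    best_id, best_score = None, 0.0
    for pdf_id, pdf_text in catalogue.items():
        score = containment(ocr_text, pdf_text)
        if score > best_score:
            best_id, best_score = pdf_id, score
    if best_id is None or best_score < threshold:
        return {"status": "not found"}
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    return {
        "status": "found",
        "pdf": best_id,
        "link": f"{base_url}/{best_id}?ts={stamp}",
        "score": round(best_score, 3),
    }
```

Containment is used here instead of a symmetric similarity because a scanned page is normally a small part of a longer preview. For a large catalogue, the n-gram sets would be precomputed and indexed rather than rebuilt on every request.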