AI Data Cleansing & Enrichment Platform

Client: AI | Published: 20.10.2025
Budget: 1500 $

I need a web-based, AI-assisted platform that ingests mixed data (text and numerical) and automatically cleanses and enriches it. I’m flexible on hosting, so you may deploy to either AWS or Google Cloud; advise which stack best suits the build and future scaling.

Core requirements
• Automated removal of duplicates, correction of errors, and standardisation of formats.
• A “Dashobard” (spelled as provided) that contrasts the dataset before and after processing and surfaces key metrics for quick “Analysis (Before and after)”.
• An “Iddntify” module that flags outliers, incomplete records, or suspicious values for manual review.
• Enrichment layer that appends or infers additional fields using NLP or other AI models.
• Simple, responsive web UI plus a REST/JSON API so other systems can push and pull data.
• Role-based access, basic audit logging, and secure storage of all processed files.

Deliverables
1. Source code (Python preferred, but I’m open to alternatives) with clear build instructions.
2. Cloud infrastructure scripts or templates (CloudFormation, Terraform, or Deployment Manager) to spin up the environment on AWS or Google Cloud.
3. Model training pipeline and any pre-trained weights required for the cleansing logic.
4. The operational Dashobard, Analysis (Before and after) view, and Iddntify alert panel fully wired to the back-end.
5. Concise user guide and hand-off documentation.

Quality bar
• Accuracy of cleansing >95 % on the provided sample data.
• UI loads in under two seconds for a 50 K-row dataset.
• Codebase lint-clean and unit-tested (minimum 80 % coverage).

If you’ve built similar ETL or data quality tools on AWS, Google Cloud, or both, I’d love to see a brief demo link or repo. More info in the attached scope.
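
For orientation only, here is a minimal sketch of what the cleansing, Iddntify, and before/after requirements could look like in Python. It is an illustration under assumptions, not a specification. The function names (standardise, deduplicate, flag_for_review, profile), the sample fields (name, city, amount), and the choice of a median-based modified z-score with a 3.5 cutoff for outliers are all hypothetical and do not come from the attached scope. It uses only the standard library and treats records as flat dictionaries.

```python
import statistics
from typing import Any

Record = dict[str, Any]


def standardise(record: Record) -> Record:
    """Collapse whitespace and lowercase text fields; blank text becomes None."""
    cleaned: Record = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = " ".join(value.split()).lower()
            if not value:
                value = None
        cleaned[key] = value
    return cleaned


def deduplicate(records: list[Record]) -> list[Record]:
    """Drop exact duplicates, keeping the first occurrence of each record."""
    seen: set[tuple] = set()
    unique: list[Record] = []
    for record in records:
        # Sort by field name so key order does not affect the comparison.
        fingerprint = tuple(sorted(record.items(), key=lambda item: item[0]))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


def flag_for_review(
    records: list[Record], numeric_field: str, threshold: float = 3.5
) -> dict[int, list[str]]:
    """Map record index -> reasons ("incomplete", "outlier") for manual review."""
    values = [
        r[numeric_field]
        for r in records
        if isinstance(r.get(numeric_field), (int, float))
    ]
    median = statistics.median(values) if values else 0.0
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = statistics.median([abs(v - median) for v in values]) if values else 0.0

    flags: dict[int, list[str]] = {}
    for index, record in enumerate(records):
        reasons: list[str] = []
        if any(value is None for value in record.values()):
            reasons.append("incomplete")
        value = record.get(numeric_field)
        if mad > 0 and isinstance(value, (int, float)):
            # Modified z-score; 3.5 is an assumed, commonly used cutoff.
            if abs(0.6745 * (value - median) / mad) > threshold:
                reasons.append("outlier")
        if reasons:
            flags[index] = reasons
    return flags


def profile(records: list[Record]) -> dict[str, int]:
    """Simple before/after metrics: row count and number of empty cells."""
    missing = sum(
        1
        for record in records
        for value in record.values()
        if value is None or (isinstance(value, str) and not value.strip())
    )
    return {"rows": len(records), "missing_cells": missing}


# Hypothetical sample data, invented for this illustration.
rows = [
    {"name": " Alice  Smith ", "city": "Paris", "amount": 100.0},
    {"name": "alice smith", "city": "paris", "amount": 100.0},
    {"name": "Bob Jones", "city": "", "amount": 110.0},
    {"name": "Carol White", "city": "Berlin", "amount": 95.0},
    {"name": "Dan Brown", "city": "Rome", "amount": 105.0},
    {"name": "Eve Black", "city": "Oslo", "amount": 5000.0},
]

before = profile(rows)
cleaned = deduplicate([standardise(row) for row in rows])
after = profile(cleaned)
flags = flag_for_review(cleaned, "amount")
print(before, after, flags)
```

On this sample, the two Alice rows collapse into one after standardisation, the record with a blank city is flagged as incomplete, and the 5000.0 amount is flagged as an outlier. A production build would likely replace exact matching with fuzzy or model-based matching and use a dataframe or SQL engine to meet the 50 K-row performance target; those choices are left open here.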