Java E-commerce Data Crawler

Client: AI | Published: 09.11.2025
Budget: $50

I need a small yet robust Java program that crawls selected e-commerce sites, extracts product data, and returns it in a clean, structured format. The crawler should navigate through category and product pages, fetch the HTML, and parse the following details with regular expressions or similarly reliable pattern-matching techniques:

• Product name and full description
• Current price plus stock / availability status
• Customer review text and rating, where present

Key points

– Written entirely in Java 8+; feel free to lean on libraries such as Jsoup or Apache HttpClient for fetching and parsing, but the matching logic itself must be regex- or pattern-based so I can tweak it later.
– Input is a simple list of URLs (or a seed category page) that I can drop into a config file.
– Output should be JSON or CSV, whichever is faster for you to implement, containing the three data groups above.
– Please include brief build/run instructions and inline comments showing where to adjust patterns for different site layouts.
– The crawler has to respect robots.txt and handle polite delays between requests.

I’ll test by pointing the finished JAR at two live e-commerce domains and comparing the generated file with the on-page data; if the three product sections match for at least 95 % of items scanned, the job is done.
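To make the "regex- or pattern-based, tweakable later" requirement concrete, here is a minimal sketch of what the matching layer might look like. Everything specific in it is an assumption for illustration: the class name ProductExtractor, the field names, and above all the HTML markers (product-title, price, stock, data-rating), which stand in for whatever a real target site uses. Fetching, robots.txt handling and request delays are left out; the description and review text would be additional entries of the same shape.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: the patterns target a hypothetical page layout, not any real site.
class ProductExtractor {

    // ADJUST HERE: one pattern per field; group(1) must capture the value.
    static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("name", Pattern.compile(
                "<h1[^>]*class=\"product-title\"[^>]*>(.*?)</h1>", Pattern.DOTALL));
        PATTERNS.put("price", Pattern.compile(
                "<span[^>]*class=\"price\"[^>]*>\\s*([^<]+?)\\s*</span>"));
        PATTERNS.put("availability", Pattern.compile(
                "<p[^>]*class=\"stock\"[^>]*>\\s*([^<]+?)\\s*</p>"));
        PATTERNS.put("rating", Pattern.compile("data-rating=\"([0-9.]+)\""));
    }

    // Applies every pattern to the page; a field with no match becomes "".
    static Map<String, String> extract(String html) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> e : PATTERNS.entrySet()) {
            Matcher m = e.getValue().matcher(html);
            out.put(e.getKey(), m.find() ? m.group(1).trim() : "");
        }
        return out;
    }

    // Minimal JSON object writer; escapes only backslash, quote and common whitespace.
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
        }
        return sb.append("}").toString();
    }

    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t");
    }

    public static void main(String[] args) {
        // Inline sample page so the sketch runs without network access.
        String html = "<h1 class=\"product-title\">Blue Mug</h1>"
                + "<span class=\"price\"> $12.50 </span>"
                + "<p class=\"stock\">In stock</p>"
                + "<div data-rating=\"4.5\"></div>";
        System.out.println(toJson(extract(html)));
    }
}
```

Keeping all patterns in one map means that adapting to a different site layout touches only the static block, which is roughly what the inline-comment requirement asks for. A production version would likely load the patterns from the same config file as the URL list.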