Price Tracker - Scraper Architecture
Current Structure
1. scraper.py - Base Scraper Class
- Purpose: Foundation class for all price scraping
- Handles: Generic e-commerce sites (Amazon.com, eBay, Walmart, etc.)
- Key Features:
  - Base PriceScraper class with HTTP session management
  - Anti-bot measures (headers, delays, retries)
  - Generic price extraction methods
  - Site detection logic
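The retry and generic price extraction features listed above might look roughly like the following sketch. It is not the project's actual code: the method names `extract_price` and `fetch_with_retries`, the header values, and the linear back-off are assumptions made for illustration, and the HTTP call is passed in as a plain function so the example runs without a network connection.

```python
import re
import time
from typing import Callable, Optional


class PriceScraper:
    """Illustrative base scraper; names and defaults here are assumptions."""

    # Hypothetical browser-like headers; the real header set is not documented here.
    DEFAULT_HEADERS = {
        "User-Agent": "Mozilla/5.0 (compatible; PriceTracker/0.1)",
        "Accept-Language": "en-GB,en;q=0.9",
    }

    # A currency symbol followed by an amount such as 12.99 or 1,234.50.
    PRICE_RE = re.compile(r"[£$€]\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)")

    def extract_price(self, text: str) -> Optional[float]:
        """Return the first price found in the text, or None."""
        match = self.PRICE_RE.search(text)
        if match is None:
            return None
        return float(match.group(1).replace(",", ""))

    def fetch_with_retries(
        self,
        fetch: Callable[[str], str],
        url: str,
        retries: int = 3,
        delay: float = 1.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> str:
        """Call fetch(url), retrying on any exception with a growing delay."""
        last_error: Optional[Exception] = None
        for attempt in range(retries):
            try:
                return fetch(url)
            except Exception as exc:  # a real scraper would catch narrower errors
                last_error = exc
                if attempt < retries - 1:
                    sleep(delay * (attempt + 1))
        assert last_error is not None
        raise last_error
```

In a real scraper, `fetch` would wrap a shared HTTP session that sends `DEFAULT_HEADERS`; keeping it injectable also makes the retry logic easy to test.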
2. uk_scraper.py - UK Catering Specialist
- Purpose: Specialized scraper for UK catering supply websites
- Handles: JJ Food Service, A to Z Catering, Amazon UK
- Key Features:
  - Inherits from PriceScraper base class
  - UK currency handling (£ symbol)
  - Delivery vs Collection price prioritization
  - Special pricing detection (offers, strikethrough, was/now pricing)
  - Site-specific CSS selectors (e.g., .my-price.price-offer for A to Z)
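As a rough illustration of the £ handling, was/now detection, and delivery vs collection prioritization described above, consider the sketch below. The method names `parse_gbp` and `pick_price` are hypothetical, the base class is a stand-in so the example runs on its own, and the default priority order (delivery first) is an assumption, since this document does not say which price wins.

```python
import re
from typing import Dict, Optional, Sequence


class PriceScraper:
    """Minimal stand-in for the base class so this sketch is self-contained."""


class UKCateringScraper(PriceScraper):
    # A pound amount such as "£4.49" or "£1,250.00".
    GBP_RE = re.compile(r"£\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)")
    # "was £X now £Y" pricing; only the current ("now") amount is captured.
    WAS_NOW_RE = re.compile(
        r"was\s*£\s*[\d,.]+\s*now\s*£\s*(\d+(?:,\d{3})*(?:\.\d{1,2})?)",
        re.IGNORECASE,
    )

    def parse_gbp(self, text: str) -> Optional[float]:
        """Return the current price in pounds, preferring a was/now offer."""
        match = self.WAS_NOW_RE.search(text) or self.GBP_RE.search(text)
        if match is None:
            return None
        return float(match.group(1).replace(",", ""))

    def pick_price(
        self,
        prices: Dict[str, Optional[float]],
        priority: Sequence[str] = ("delivery", "collection"),  # assumed order
    ) -> Optional[float]:
        """Return the first available price, following the priority order."""
        for kind in priority:
            value = prices.get(kind)
            if value is not None:
                return value
        return None
```

Passing `priority=("collection", "delivery")` flips the preference without touching the parsing code.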
3. scraper_manager.py - Orchestration Layer
- Purpose: Routes scraping tasks to appropriate scrapers
- Logic:
  - Detects UK catering sites → uses UKCateringScraper
  - Detects other sites → uses base PriceScraper
  - Manages concurrent requests and error handling
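The routing logic above could be sketched as follows. This is an assumption-laden illustration, not the real scraper_manager.py: the `ScraperManager` class, its `scraper_for` and `scrape_all` methods, and the use of a thread pool are all hypothetical (the actual manager may use asyncio or another mechanism), and both scraper classes are stubs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Optional, Tuple

# Site keys taken from the Site Mapping section of this document.
UK_SITES = {"jjfoodservice", "atoz_catering", "amazon_uk"}


class PriceScraper:
    """Stub base scraper; the real class fetches and parses a page."""

    def scrape(self, url: str) -> Optional[float]:
        raise NotImplementedError


class UKCateringScraper(PriceScraper):
    """Stub UK scraper."""


class ScraperManager:
    def __init__(self, max_workers: int = 4) -> None:
        self.base = PriceScraper()
        self.uk = UKCateringScraper()
        self.max_workers = max_workers

    def scraper_for(self, site_key: str) -> PriceScraper:
        """UK catering sites go to the specialist, everything else to the base."""
        return self.uk if site_key in UK_SITES else self.base

    def scrape_all(self, tasks: Dict[str, str]) -> Dict[str, Optional[float]]:
        """Scrape {site_key: url} concurrently; a failed site maps to None."""

        def run(item: Tuple[str, str]) -> Tuple[str, Optional[float]]:
            site_key, url = item
            try:
                return site_key, self.scraper_for(site_key).scrape(url)
            except Exception:  # one failing site must not abort the batch
                return site_key, None

        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return dict(pool.map(run, tasks.items()))
```

Catching the error inside each task keeps one broken site from hiding the results of the others.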
Site Mapping
UK Catering Sites (UKCateringScraper):
- jjfoodservice → JJ Food Service
- atoz_catering → A to Z Catering
- amazon_uk → Amazon UK
International Sites (PriceScraper):
- amazon → Amazon.com
- ebay → eBay
- walmart → Walmart
- (Future sites can be added here)
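One way to turn a product URL into one of the site keys above is a hostname lookup, sketched below. The `detect_site` function and the `SITE_DOMAINS` table are hypothetical. The domains for JJ Food Service and A to Z Catering in particular are unverified placeholders (the A to Z entry deliberately uses a reserved `.example` name), so they would need replacing with the real hostnames.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlparse

# (site key, domain) pairs. Keys come from the mapping above; domains are
# assumptions for illustration and should be checked against the real sites.
SITE_DOMAINS: List[Tuple[str, str]] = [
    ("amazon_uk", "amazon.co.uk"),
    ("amazon", "amazon.com"),
    ("ebay", "ebay.com"),
    ("walmart", "walmart.com"),
    ("jjfoodservice", "jjfoodservice.com"),      # unverified
    ("atoz_catering", "atoz-catering.example"),  # placeholder domain
]


def detect_site(url: str) -> Optional[str]:
    """Return the site key for a URL, or None if the host is not recognised."""
    host = (urlparse(url).hostname or "").lower()
    for site_key, domain in SITE_DOMAINS:
        # Match the bare domain or any subdomain such as "www.".
        if host == domain or host.endswith("." + domain):
            return site_key
    return None
```

Matching on the parsed hostname rather than a substring of the whole URL avoids false positives such as a query string that happens to contain "amazon.com".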
Key Benefits of Current Structure
- ✅ Separation of Concerns: UK-specific logic is isolated
- ✅ Extensibility: Easy to add new UK sites or international sites
- ✅ Maintainability: Changes to UK logic don't affect international scraping
- ✅ Specialization: UK scraper handles currency, delivery pricing, special offers
Recommendations
✅ KEEP CURRENT STRUCTURE - It's well-designed!
The separation between scraper.py and uk_scraper.py is sound architecture because:
- UK catering sites have unique requirements (delivery vs collection, £ pricing, special offers)
- International sites have different patterns (USD pricing, different site structures)
- Each scraper can be maintained and extended independently
Minor Improvements Made:
- Enhanced site detection in base scraper
- Added helper methods to determine scraper routing
- Improved scraper manager logic for clarity
- Fixed A to Z pricing with .my-price.price-offer selector
Final File Structure
src/
├── scraper.py # Base scraper (international sites)
├── uk_scraper.py # UK catering specialist
├── scraper_manager.py # Orchestration layer
├── config.py # Configuration management
├── database.py # Data persistence
└── web_ui.py # Flask web interface
This structure supports both current UK catering needs and future expansion to international e-commerce sites.